Abstract

We introduce a simple tree growth process that gives rise to a new two-parameter family of discrete fragmentation trees that extends Ford's alpha model to multifurcating trees and includes the trees obtained by uniform sampling from Duquesne and Le Gall's stable continuum random tree. We call these new trees the alpha-gamma trees. In this paper, we obtain their splitting rules, dislocation measures both in ranked order and in size-biased order, and we study their limiting behaviour.

AMS 2000 subject classifications: 60J80.
Keywords: Alpha-gamma tree, splitting rule, sampling consistency, self-similar fragmentation, dislocation measure, continuum random tree, R-tree, Markov branching model






1 Introduction 

Markov branching trees were introduced by Aldous [3] as a class of random binary phylogenetic models and extended to the multifurcating case in [16]. Consider the space $\mathbb{T}_n$ of combinatorial trees without degree-2 vertices, one degree-1 vertex called the ROOT and exactly $n$ further degree-1 vertices labelled by $[n] = \{1, \ldots, n\}$ and called the leaves; we call the other vertices branch points. Distributions on $\mathbb{T}_n$ of random trees $T_n^*$ are determined by distributions of the delabelled tree $T_n^\circ$ on the space $\mathbb{T}_n^\circ$ of unlabelled trees and conditional label distributions, e.g. exchangeable labels. A sequence $(T_n^\circ, n \ge 1)$ of unlabelled trees has the Markov branching property if, for all $n \ge 2$, conditionally given that the branching adjacent to the ROOT is into tree components whose numbers of leaves are $n_1, \ldots, n_k$, these tree components are independent copies of $T_{n_i}^\circ$, $1 \le i \le k$. The distributions of the sizes in the first branching of $T_n^\circ$, $n \ge 2$, are denoted by

$$q(n_1, \ldots, n_k), \qquad n_1 \ge \ldots \ge n_k \ge 1,\ k \ge 2:\ n_1 + \ldots + n_k = n,$$

and referred to as the splitting rule of $(T_n^\circ, n \ge 1)$.

Aldous [3] studied in particular a one-parameter family ($\beta \ge -2$) that interpolates between several models known in various biology and computer science contexts (e.g. $\beta = -2$ comb, $\beta = -3/2$ uniform, $\beta = 0$ Yule) and that he called the beta-splitting model. He sets, for $\beta > -2$:

$$q_\beta^{\rm Aldous}(n-m, m) = \frac{1}{Z_n} \binom{n}{m} B(m+1+\beta,\, n-m+1+\beta), \qquad \text{for } 1 \le m < n/2,$$

$$q_\beta^{\rm Aldous}(n/2, n/2) = \frac{1}{2 Z_n} \binom{n}{n/2} B(n/2+1+\beta,\, n/2+1+\beta), \qquad \text{if } n \text{ even},$$

where $B(a,b) = \Gamma(a)\Gamma(b)/\Gamma(a+b)$ is the Beta function and $Z_n$, $n \ge 2$, are normalisation constants; this extends to $\beta = -2$ by continuity, i.e. $q_{-2}^{\rm Aldous}(n-1, 1) = 1$, $n \ge 2$.
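The beta-splitting rule can be evaluated numerically. The following is a minimal sketch, not taken from the paper: the function names are illustrative, $Z_n$ is obtained by summing the unnormalised weights, and the symmetric split $(n/2, n/2)$ is assumed to carry half the weight of the generic formula.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b) for a, b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def aldous_beta_splitting(n, beta):
    """Return {(n - m, m): probability} for 1 <= m <= n/2, assuming beta > -2."""
    weights = {}
    for m in range(1, n // 2 + 1):
        w = comb(n, m) * exp(log_beta(m + 1 + beta, n - m + 1 + beta))
        if 2 * m == n:
            w /= 2  # assumption: the symmetric split is counted once
        weights[(n - m, m)] = w
    z_n = sum(weights.values())  # normalisation constant Z_n
    return {split: w / z_n for split, w in weights.items()}
```

For $\beta = 0$ all asymmetric splits receive the same probability, which is the Yule case mentioned above.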
For exchangeably labelled Markov branching models $(T_n, n \ge 1)$ it is convenient to set

$$p(n_1, \ldots, n_k) := \frac{m_1! \cdots m_n!}{\binom{n}{n_1, \ldots, n_k}}\, q((n_1, \ldots, n_k)^\downarrow), \qquad n_j \ge 1,\ j \in [k];\ k \ge 2:\ n = n_1 + \ldots + n_k, \qquad (1)$$

where $(n_1, \ldots, n_k)^\downarrow$ is the decreasing rearrangement and $m_r$ the number of $r$s of the sequence $(n_1, \ldots, n_k)$. The function $p$ is called exchangeable partition probability function (EPPF) and gives the probability that the branching adjacent to the ROOT splits into tree components with label sets $\{A_1, \ldots, A_k\}$ partitioning $[n]$, with block sizes $n_j = \#A_j$. Note that $p$ is invariant under permutations of its arguments. It was shown in [19] that Aldous's beta-splitting models for $\beta > -2$ are the only binary Markov branching models for which the EPPF is of Gibbs type

$$p_{-1-\alpha}^{\rm Aldous}(n_1, n_2) = \frac{w_{n_1} w_{n_2}}{Z_{n_1+n_2}}, \quad n_1 \ge 1,\ n_2 \ge 1, \qquad \text{in particular } w_n = \frac{\Gamma(n-\alpha)}{\Gamma(1-\alpha)},$$

and that the multifurcating Gibbs models are an extended Ewens-Pitman two-parameter family of random partitions, $0 \le \alpha \le 1$, $\theta \ge -2\alpha$, or $-\infty \le \alpha < 0$, $\theta = -m\alpha$ for some integer $m \ge 2$,

$$p_{\alpha,\theta}^{\rm PD^*}(n_1, \ldots, n_k) = \frac{a_k}{Z_n} \prod_{j=1}^{k} w_{n_j}, \qquad \text{where } w_n = \frac{\Gamma(n-\alpha)}{\Gamma(1-\alpha)} \text{ and } a_k = \alpha^{k-2}\, \frac{\Gamma(k+\theta/\alpha)}{\Gamma(2+\theta/\alpha)}, \qquad (2)$$

boundary cases by continuity. Ford [12] introduced a different binary model, the alpha model, using simple sequential growth rules starting from the unique elements $T_1 \in \mathbb{T}_1$ and $T_2 \in \mathbb{T}_2$:

(i)$^F$ given $T_n$ for $n \ge 2$, assign a weight $1-\alpha$ to each of the $n$ edges adjacent to a leaf, and a weight $\alpha$ to each of the $n-1$ other edges;

(ii)$^F$ select at random with probabilities proportional to the weights assigned by step (i)$^F$, an edge of $T_n$, say $a_n \to c_n$ directed away from the ROOT;

(iii)$^F$ to create $T_{n+1}$ from $T_n$, replace $a_n \to c_n$ by three edges $a_n \to b_n$, $b_n \to c_n$ and $b_n \to n+1$ so that two new edges connect the two vertices $a_n$ and $c_n$ to a new branch point $b_n$ and a further edge connects $b_n$ to a new leaf labelled $n+1$.

It was shown in [12] that these trees are Markov branching trees but that the labelling is not exchangeable. The splitting rule was calculated and shown to coincide with Aldous's beta-splitting rules if and only if $\alpha = 0$, $\alpha = 1/2$ or $\alpha = 1$, interpolating differently between Aldous's corresponding models for $\beta = 0$, $\beta = -3/2$ and $\beta = -2$. This study was taken further in [16, 23].

In this paper, we introduce a new model by extending the simple sequential growth rules to allow multifurcation. Specifically, we also assign weights to vertices as follows, cf. Figure 1:

(i) given $T_n$ for $n \ge 2$, assign a weight $1-\alpha$ to each of the $n$ edges adjacent to a leaf, a weight $\gamma$ to each of the other edges, and a weight $(k-1)\alpha - \gamma$ to each vertex of degree $k+1 \ge 3$;

(ii) select at random with probabilities proportional to the weights assigned by step (i),

• an edge of $T_n$, say $a_n \to c_n$ directed away from the ROOT,

• or, as the case may be, a vertex of $T_n$, say $v_n$;

(iii) to create $T_{n+1}$ from $T_n$, do the following:

• if an edge $a_n \to c_n$ was selected, replace it by three edges $a_n \to b_n$, $b_n \to c_n$ and $b_n \to n+1$ so that two new edges connect the two vertices $a_n$ and $c_n$ to a new branch point $b_n$ and a further edge connects $b_n$ to a new leaf labelled $n+1$;

• if a vertex $v_n$ was selected, add an edge $v_n \to n+1$ to a new leaf labelled $n+1$.
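The growth rules (i)-(iii) translate directly into a simulation. The sketch below is only an illustration under stated assumptions: the tree is stored as a dictionary from node ids to lists of children, node id 0 plays the role of the ROOT, and the function name and data layout are hypothetical choices rather than anything prescribed by the paper.

```python
import random

ROOT = 0


def grow_alpha_gamma_tree(n, alpha, gamma, seed=None):
    """Grow a tree with n >= 2 leaves; returns (children, label).

    children maps each node id to its list of child ids,
    label maps leaf node ids to leaf labels 1..n.
    """
    rng = random.Random(seed)
    # T_2: ROOT -> branch point 1 -> leaves labelled 1 and 2
    children = {ROOT: [1], 1: [2, 3], 2: [], 3: []}
    label = {2: 1, 3: 2}
    next_id = 4
    for m in range(2, n):  # the current tree has m leaves
        items, weights = [], []
        for parent, kids in children.items():
            for child in kids:  # step (i): edge weights
                items.append(("edge", parent, child))
                weights.append(1 - alpha if child in label else gamma)
            if parent != ROOT and len(kids) >= 2:  # step (i): vertex weights
                items.append(("vertex", parent, None))
                weights.append((len(kids) - 1) * alpha - gamma)
        kind, a, c = rng.choices(items, weights=weights)[0]  # step (ii)
        new_leaf = next_id
        next_id += 1
        children[new_leaf] = []
        label[new_leaf] = m + 1
        if kind == "edge":  # step (iii): split the edge a -> c at a new b
            b = next_id
            next_id += 1
            children[a][children[a].index(c)] = b
            children[b] = [c, new_leaf]
        else:  # step (iii): attach the new leaf to the selected vertex
            children[a].append(new_leaf)
    return children, label
```

With `gamma == alpha` every vertex of degree 3 has weight zero, so the sketch only ever produces binary trees, in line with Ford's alpha model.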




Figure 1: Sequential growth rule: displayed is one branch point of $T_n$ with degree $k+1$, hence vertex weight $(k-1)\alpha - \gamma$, with $k-r$ leaves $L_{r+1}, \ldots, L_k \in [n]$ and $r$ bigger subtrees $S_1, \ldots, S_r$ attached to it; all edges also carry weights, weight $1-\alpha$ and $\gamma$ are displayed here for one leaf edge and one inner edge only; the three associated possibilities for $T_{n+1}$ are displayed.

We call the resulting model the alpha-gamma model. These growth rules satisfy the rules of probability for all $0 \le \alpha \le 1$ and $0 \le \gamma \le \alpha$. They contain the growth rules of the alpha model for $\gamma = \alpha$, since then every vertex of degree 3 has weight zero and every inner edge has weight $\alpha$. They also contain growth rules for a model [18, 20] based on the stable tree of Duquesne and Le Gall [7], for the cases $\gamma = 1-\alpha$, $1/2 \le \alpha < 1$, where all edges are given the same weight; we show here that these cases $\gamma = 1-\alpha$, $1/2 \le \alpha < 1$, as well as $\alpha = \gamma = 0$, form the intersection with the extended Ewens-Pitman-type two-parameter family of models (2).

Proposition 1. Let $(T_n, n \ge 1)$ be alpha-gamma trees with distributions as implied by the sequential growth rules (i)-(iii) for some $0 \le \alpha \le 1$ and $0 \le \gamma \le \alpha$. Then

(a) the delabelled trees $T_n^\circ$, $n \ge 1$, have the Markov branching property. The splitting rules are

$$q_{\alpha,\gamma}^{\rm seq}(n_1, \ldots, n_k) \propto \left( \gamma + (1-\alpha-\gamma) \frac{1}{n(n-1)} \sum_{i \ne j} n_i n_j \right) q_{\alpha,-\alpha-\gamma}^{\rm PD^*}(n_1, \ldots, n_k) \qquad (3)$$

in the case $0 \le \alpha < 1$, where $q_{\alpha,-\alpha-\gamma}^{\rm PD^*}$ is the splitting rule associated via (1) with $p_{\alpha,-\alpha-\gamma}^{\rm PD^*}$, the Ewens-Pitman-type EPPF given in (2), and LHS $\propto$ RHS means equality up to a multiplicative constant depending on $n$ and $(\alpha, \gamma)$ that makes the LHS a probability function;

(b) the labelling of $T_n$ is exchangeable for all $n \ge 1$ if and only if $\gamma = 1-\alpha$, $1/2 \le \alpha \le 1$.

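For any function $(n_1, \ldots, n_k) \mapsto q(n_1, \ldots, n_k)$ that is a probability function for all fixed $n = n_1 + \ldots + n_k$, $n \ge 2$, we can construct a Markov branching model $(T_n^\circ, n \ge 1)$. A condition called sampling consistency [3] is to require that the tree $T_{n,-1}^\circ$ constructed from $T_n^\circ$ by removal of a uniformly chosen leaf (and the adjacent branch point if its degree is reduced to 2) has the same distribution as $T_{n-1}^\circ$, for all $n \ge 2$. This is appealing for applications with incomplete observations. It was shown in [16] that all sampling consistent splitting rules admit an integral representation $(c, \nu)$ for an erosion coefficient $c \ge 0$ and a dislocation measure $\nu$ on $\mathcal{S}^\downarrow = \{s = (s_i)_{i \ge 1} : s_1 \ge s_2 \ge \ldots \ge 0,\ s_1 + s_2 + \ldots \le 1\}$ with $\nu(\{(1, 0, 0, \ldots)\}) = 0$ and $\int_{\mathcal{S}^\downarrow} (1-s_1)\, \nu(ds) < \infty$ as in Bertoin's continuous-time fragmentation theory [HEIE]. In the most relevant case when $c = 0$ and $\nu(\{s \in \mathcal{S}^\downarrow : s_1 + s_2 + \ldots < 1\}) = 0$, this representation is

$$p(n_1, \ldots, n_k) = \frac{1}{Z_n} \sum_{\substack{i_1, \ldots, i_k \ge 1 \\ \text{distinct}}} \int_{\mathcal{S}^\downarrow} \prod_{j=1}^{k} s_{i_j}^{n_j}\, \nu(ds), \qquad n_j \ge 1,\ j \in [k];\ k \ge 2:\ n = n_1 + \ldots + n_k, \qquad (4)$$

where $Z_n = \int_{\mathcal{S}^\downarrow} (1 - \sum_{i \ge 1} s_i^n)\, \nu(ds)$, $n \ge 2$, are the normalisation constants. The measure $\nu$ is unique up to a multiplicative constant. In particular, it can be shown [20, 17] that for the Ewens-Pitman EPPFs $p_{\alpha,\theta}^{\rm PD^*}$ we obtain $\nu = {\rm PD}^*_{\alpha,\theta}(ds)$ of Poisson-Dirichlet type (hence our superscript PD$^*$ for the Ewens-Pitman type EPPF), where for $0 < \alpha < 1$ and $\theta > -2\alpha$ we can express

$$\int_{\mathcal{S}^\downarrow} f(s)\, {\rm PD}^*_{\alpha,\theta}(ds) = \mathbb{E}\left( \sigma_1^{-\theta} f\left( \Delta\sigma_{[0,1]} / \sigma_1 \right) \right)$$

for an $\alpha$-stable subordinator $\sigma$ with Laplace exponent $-\log(\mathbb{E}(e^{-\lambda \sigma_1})) = \lambda^\alpha$ and with ranked sequence of jumps $\Delta\sigma_{[0,1]} = (\Delta\sigma_t, t \in [0,1])^\downarrow$. For $\alpha < 1$ and $\theta = -2\alpha$, we have

$$\int_{\mathcal{S}^\downarrow} f(s)\, {\rm PD}^*_{\alpha,-2\alpha}(ds) = \int_{1/2}^{1} f(x, 1-x, 0, 0, \ldots)\, x^{-\alpha-1} (1-x)^{-\alpha-1}\, dx.$$

Note that $\nu = {\rm PD}^*_{\alpha,\theta}$ is infinite but $\sigma$-finite with $\int_{\mathcal{S}^\downarrow} (1-s_1)\, \nu(ds) < \infty$ for $-2\alpha \le \theta \le -\alpha$. This is the relevant range for this paper. For $\theta > -\alpha$, the measure ${\rm PD}^*_{\alpha,\theta}$ just defined is a multiple of the usual Poisson-Dirichlet probability measure ${\rm PD}_{\alpha,\theta}$ on $\mathcal{S}^\downarrow$, so for the integral representation of $p_{\alpha,\theta}^{\rm PD^*}$ we could also take $\nu = {\rm PD}_{\alpha,\theta}$ in this case, and this is also an appropriate choice for the two cases $\alpha = 0$ and $m \ge 3$; the case $\alpha = 1$ is degenerate, $q_{\alpha,\theta}^{\rm PD^*}(1, 1, \ldots, 1) = 1$ (for all $\theta$), and can be associated with $\nu = {\rm PD}^*_{1,\theta} = \delta_{(0,0,\ldots)}$, see [19].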

Theorem 2. The alpha-gamma-splitting rules $q_{\alpha,\gamma}^{\rm seq}$ are sampling consistent. For $0 \le \alpha < 1$ and $0 \le \gamma \le \alpha$ the measure $\nu$ in the integral representation can be chosen as

$$\nu_{\alpha,\gamma}(ds) = \left( \gamma + (1-\alpha-\gamma) \sum_{i \ne j} s_i s_j \right) {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds). \qquad (5)$$

The case $\alpha = 1$ is discussed in Section 3.2. We refer to Griffiths [1^ who used discounting of Poisson-Dirichlet measures by quantities involving $\sum_{i \ne j} s_i s_j$ to model genic selection.

In [16], Haas and Miermont's self-similar continuum random trees (CRTs) [15] are shown to be scaling limits for a wide class of Markov branching models. See Sections 3.3 and 3.6 for details. This theory applies here to yield:

Corollary 3. Let $(T_n^\circ, n \ge 1)$ be delabelled alpha-gamma trees, represented as discrete $\mathbb{R}$-trees with unit edge lengths, for some $0 < \alpha < 1$ and $0 < \gamma \le \alpha$. Then

$$\frac{T_n^\circ}{n^\gamma} \to \mathcal{T}^{\alpha,\gamma} \qquad \text{in distribution for the Gromov-Hausdorff topology,}$$

where the scaling $n^\gamma$ is applied to all edge lengths, and $\mathcal{T}^{\alpha,\gamma}$ is a $\gamma$-self-similar CRT whose dislocation measure is a multiple of $\nu_{\alpha,\gamma}$.

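We observe that every dislocation measure $\nu$ on $\mathcal{S}^\downarrow$ gives rise to a measure $\nu^{\rm sb}$ on the space of summable sequences under which fragment sizes are in a size-biased random order, just as the ${\rm GEM}_{\alpha,\theta}$ distribution can be defined as the distribution of a ${\rm PD}_{\alpha,\theta}$ sequence re-arranged in size-biased random order [22]. We similarly define ${\rm GEM}^*_{\alpha,\theta}$ from ${\rm PD}^*_{\alpha,\theta}$. One of the advantages of size-biased versions is that, as for ${\rm GEM}_{\alpha,\theta}$, we can calculate marginal distributions explicitly.

Proposition 4. For $0 < \alpha < 1$ and $0 \le \gamma < \alpha$, distributions $\nu_k^{\rm sb}$ of the first $k \ge 1$ marginals of the size-biased form $\nu_{\alpha,\gamma}^{\rm sb}$ of $\nu_{\alpha,\gamma}$ are given, for $x = (x_1, \ldots, x_k)$, by

$$\nu_k^{\rm sb}(dx) = \left( \gamma + (1-\alpha-\gamma) \left( 1 - \sum_{i=1}^{k} x_i^2 - \frac{1-\alpha}{1+(k-1)\alpha-\gamma} \Big( 1 - \sum_{i=1}^{k} x_i \Big)^2 \right) \right) {\rm GEM}^*_{\alpha,-\alpha-\gamma}(dx).$$

The other boundary values of parameters are trivial here: there are at most two non-zero parts.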



We can investigate the convergence of Corollary 3 when labels are retained. Since labels are non-exchangeable, in general, it is not clear how to nicely represent a continuum tree with infinitely many labels other than by a consistent sequence $\mathcal{R}_k$ of trees with $k$ leaves labelled $[k]$, $k \ge 1$. See however [23] for developments in the binary case $\gamma = \alpha$ on how to embed $\mathcal{R}_k$, $k \ge 1$, in a CRT $\mathcal{T}^{\alpha,\alpha}$. The following theorem extends Proposition 18 of [16] to the multifurcating case.



Theorem 5. Let $(T_n, n \ge 1)$ be a sequence of trees resulting from the alpha-gamma-tree growth rules for some $0 < \alpha < 1$ and $0 < \gamma \le \alpha$. Denote by $R(T_n, [k])$ the subtree of $T_n$ spanned by the ROOT and leaves $[k]$, reduced by removing degree-2 vertices, represented as discrete $\mathbb{R}$-tree with graph distances in $T_n$ as edge lengths. Then

$$\frac{R(T_n, [k])}{n^\gamma} \to \mathcal{R}_k \qquad \text{a.s. in the sense that all edge lengths converge,}$$

for some discrete tree $\mathcal{R}_k$ with shape $T_k$ and edge lengths specified in terms of three random variables, conditionally independent given that $T_k$ has $k+\ell$ edges, as $L_k W_k^\gamma D_k$ with

• $W_k \sim {\rm beta}(k(1-\alpha) + \ell\gamma,\ (k-1)\alpha - \ell\gamma)$, where ${\rm beta}(a,b)$ is the beta distribution with density $B(a,b)^{-1} x^{a-1} (1-x)^{b-1} 1_{(0,1)}(x)$;

• $L_k$ with density $\dfrac{\Gamma(1 + k(1-\alpha) + \ell\gamma)}{\Gamma(1 + \ell + k(1-\alpha)/\gamma)}\, s^{\ell + k(1-\alpha)/\gamma}\, g_\gamma(s)$, where $g_\gamma$ is the Mittag-Leffler density, the density of $\sigma_1^{-\gamma}$ for a subordinator $\sigma$ with Laplace exponent $\lambda^\gamma$;

• $D_k \sim {\rm Dirichlet}((1-\alpha)/\gamma, \ldots, (1-\alpha)/\gamma, 1, \ldots, 1)$, where ${\rm Dirichlet}(a_1, \ldots, a_m)$ is the Dirichlet distribution on $\Delta_m = \{(x_1, \ldots, x_m) \in [0,1]^m : x_1 + \ldots + x_m = 1\}$ with density of the first $m-1$ marginals proportional to $x_1^{a_1-1} \cdots x_{m-1}^{a_{m-1}-1} (1 - x_1 - \ldots - x_{m-1})^{a_m-1}$; here $D_k$ contains edge length proportions, first with parameter $(1-\alpha)/\gamma$ for edges adjacent to leaves and then with parameter 1 for the other edges, each enumerated e.g. by depth first search.

In fact, $1 - W_k$ captures the total limiting leaf proportions of subtrees that are attached on the vertices of $T_k$, and we can study further how this is distributed between the branch points, see Section 4.

We conclude this introduction by giving an alternative description of the alpha-gamma model obtained by adding colouring rules to the alpha model growth rules (i)$^F$-(iii)$^F$, so that in $T_n^{\rm col}$ each edge except those adjacent to leaves has either a blue or a red colour mark.

(iv)$^{\rm col}$ To turn $T_{n+1}$ into a colour-marked tree $T_{n+1}^{\rm col}$, keep the colours of $T_n^{\rm col}$ and do the following:

• if an edge $a_n \to c_n$ adjacent to a leaf was selected, mark $a_n \to b_n$ blue;

• if a red edge $a_n \to c_n$ was selected, mark both $a_n \to b_n$ and $b_n \to c_n$ red;

• if a blue edge $a_n \to c_n$ was selected, mark $a_n \to b_n$ blue; mark $b_n \to c_n$ red with probability $c$ and blue with probability $1-c$.

When $(T_n^{\rm col}, n \ge 1)$ has been grown according to (i)$^F$-(iii)$^F$ and (iv)$^{\rm col}$, crush all red edges, i.e.

(cr) identify all vertices connected via red edges, remove all red edges and remove the remaining colour marks; denote the resulting sequence of trees by $(\widetilde{T}_n, n \ge 1)$.

Proposition 6. Let $(\widetilde{T}_n, n \ge 1)$ be a sequence of trees according to growth rules (i)$^F$-(iii)$^F$, (iv)$^{\rm col}$ and crushing rule (cr). Then $(\widetilde{T}_n, n \ge 1)$ is a sequence of alpha-gamma trees with $\gamma = \alpha(1-c)$.

The structure of this paper is as follows. In Section 2 we study the discrete trees grown according to the growth rules (i)-(iii) and establish Proposition 6 and Proposition 1 as well as the sampling consistency claimed in Theorem 2. Section 3 is devoted to the limiting CRTs; we obtain the dislocation measure stated in Theorem 2 and deduce Corollary 3 and Proposition 4. In Section 4 we study the convergence of labelled trees and prove Theorem 5.



2 Sampling consistent splitting rules for the alpha-gamma trees 

2.1 Notation and terminology of partitions and discrete fragmentation trees 

For $B \subseteq \mathbb{N}$, let $\mathcal{P}_B$ be the set of partitions of $B$ into disjoint non-empty subsets called blocks. Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, which supports a $\mathcal{P}_B$-valued random partition $\Pi_B$. If the probability function of $\Pi_B$ only depends on its block sizes, we call it exchangeable. Then

$$\mathbb{P}(\Pi_B = \{A_1, \ldots, A_k\}) = p(\#A_1, \ldots, \#A_k) \qquad \text{for each partition } \pi = \{A_1, \ldots, A_k\} \in \mathcal{P}_B,$$

where $\#A_j$ denotes the block size, i.e. the number of elements of $A_j$. This function $p$ is called the exchangeable partition probability function (EPPF) of $\Pi_B$. Alternatively, a random partition $\Pi_B$ is exchangeable if its distribution is invariant under the natural action on partitions of $B$ by the symmetric group of permutations of $B$.

Let $B \subseteq \mathbb{N}$. We say that a partition $\pi \in \mathcal{P}_B$ is finer than $\pi' \in \mathcal{P}_B$, and write $\pi \preceq \pi'$, if any block of $\pi$ is included in some block of $\pi'$. This defines a partial order $\preceq$ on $\mathcal{P}_B$. A process or a sequence with values in $\mathcal{P}_B$ is called refining if it is decreasing for this partial order. Refining partition-valued processes are naturally related to trees. Suppose that $B$ is a finite subset of $\mathbb{N}$ and $\mathbf{t}$ is a collection of subsets of $B$ with an additional member called the ROOT such that

• we have $B \in \mathbf{t}$; we call $B$ the common ancestor of $\mathbf{t}$;

• we have $\{i\} \in \mathbf{t}$ for all $i \in B$; we call $\{i\}$ a leaf of $\mathbf{t}$;

• for all $A \in \mathbf{t}$ and $C \in \mathbf{t}$, we have either $A \cap C = \emptyset$, or $A \subseteq C$ or $C \subseteq A$.

If $A \subset C$, then $A$ is called a descendant of $C$, or $C$ an ancestor of $A$. If for all $D \in \mathbf{t}$ with $A \subseteq D \subseteq C$ either $A = D$ or $D = C$, we call $A$ a child of $C$, or $C$ the parent of $A$ and denote $C \to A$. If we equip $\mathbf{t}$ with the parent-child relation and also ROOT $\to B$, then $\mathbf{t}$ is a rooted connected acyclic graph, i.e. a combinatorial tree. We denote the space of such trees $\mathbf{t}$ by $\mathbb{T}_B$ and also $\mathbb{T}_n = \mathbb{T}_{[n]}$. For $\mathbf{t} \in \mathbb{T}_B$ and $A \in \mathbf{t}$, the rooted subtree $\mathbf{s}_A$ of $\mathbf{t}$ with common ancestor $A$ is given by $\mathbf{s}_A = \{{\rm ROOT}\} \cup \{C \in \mathbf{t} : C \subseteq A\} \in \mathbb{T}_A$. In particular, we consider the subtrees $\mathbf{s}_j = \mathbf{s}_{A_j}$ of the common ancestor $B$ of $\mathbf{t}$, i.e. the subtrees whose common ancestors $A_j$, $j \in [k]$, are the children of $B$. In other words, $\mathbf{s}_1, \ldots, \mathbf{s}_k$ are the rooted connected components of $\mathbf{t} \setminus \{B\}$.

Let $(\pi(t), t \ge 0)$ be a $\mathcal{P}_B$-valued refining process for some finite $B \subset \mathbb{N}$ with $\pi(0) = \mathbf{1}_B$ and $\pi(t) = \mathbf{0}_B$ for some $t > 0$, where $\mathbf{1}_B$ is the trivial partition into a single block $B$ and $\mathbf{0}_B$ is the partition of $B$ into singletons. We define $\mathbf{t}_\pi = \{{\rm ROOT}\} \cup \{A \subset B : A \in \pi(t) \text{ for some } t \ge 0\}$ as the associated labelled fragmentation tree.

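Definition 1. Let $B \subset \mathbb{N}$ with $\#B = n$ and $\mathbf{t} \in \mathbb{T}_B$. We associate the relabelled tree

$$\mathbf{t}^\sigma = \{{\rm ROOT}\} \cup \{\sigma(A) : A \in \mathbf{t}\} \in \mathbb{T}_n,$$

for any bijection $\sigma : B \to [n]$, and the combinatorial tree shape of $\mathbf{t}$ as the equivalence class

$$\mathbf{t}^\circ = \{\mathbf{t}^\sigma \mid \sigma : B \to [n] \text{ bijection}\} \subseteq \mathbb{T}_n.$$

We denote by $\mathbb{T}_n^\circ = \{\mathbf{t}^\circ : \mathbf{t} \in \mathbb{T}_n\} = \{\mathbf{t}^\circ : \mathbf{t} \in \mathbb{T}_B\}$ the collection of all tree shapes with $n$ leaves, which we will also refer to in their own right as unlabelled fragmentation trees.

Note that the number of subtrees of the common ancestor of $\mathbf{t} \in \mathbb{T}_n$ and the numbers of leaves in these subtrees are invariants of the equivalence class $\mathbf{t}^\circ \subseteq \mathbb{T}_n$. If $\mathbf{t}^\circ \in \mathbb{T}_n^\circ$ has subtrees $\mathbf{s}_1^\circ, \ldots, \mathbf{s}_k^\circ$ with $n_1 \ge \ldots \ge n_k \ge 1$ leaves, we say that $\mathbf{t}^\circ$ is formed by joining together $\mathbf{s}_1^\circ, \ldots, \mathbf{s}_k^\circ$, denoted by $\mathbf{t}^\circ = \mathbf{s}_1^\circ * \ldots * \mathbf{s}_k^\circ$. We call the composition $(n_1, \ldots, n_k)$ of $n$ the first split of $\mathbf{t}^\circ$.

With this notation and terminology, a sequence of random trees $T_n^\circ \in \mathbb{T}_n^\circ$, $n \ge 1$, has the Markov branching property if, for all $n \ge 2$, the tree $T_n^\circ$ has the same distribution as $S_1^\circ * \ldots * S_{K_n}^\circ$, where $N_1 \ge \ldots \ge N_{K_n} \ge 1$ form a random composition of $n$ with $K_n \ge 2$ parts, and conditionally given $K_n = k$ and $N_j = n_j$, the trees $S_j^\circ$, $j \in [k]$, are independent and distributed as $T_{n_j}^\circ$, $j \in [k]$.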



2.2 Colour-marked trees and the proof of Proposition 6

The growth rules (i)$^F$-(iii)$^F$ construct binary combinatorial trees, denoted here by $T_n^{\rm bin}$, with vertex set

$$V = \{{\rm ROOT}\} \cup [n] \cup \{b_1, \ldots, b_{n-1}\}$$

and an edge set $E \subset V \times V$. We write $v \to w$ if $(v, w) \in E$. In Section 2.1 we identify leaf $i$ with the set $\{i\}$ and vertex $b_i$ with $\{j \in [n] : b_i \to \ldots \to j\}$, the edge set $E$ then being identified by the parent-child relation. In this framework, a colour mark for an edge $v \to b_i$ can be assigned to the vertex $b_i$, so that a coloured binary tree as constructed in (iv)$^{\rm col}$ can be represented by

$$V^{\rm col} = \{{\rm ROOT}\} \cup [n] \cup \{(b_1, \chi_n(b_1)), \ldots, (b_{n-1}, \chi_n(b_{n-1}))\}$$

for some $\chi_n(b_i) \in \{0, 1\}$, $i \in [n-1]$, where 0 represents red and 1 represents blue.

Proof of Proposition 6. We only need to check that the growth rules (i)$^F$-(iii)$^F$ and (iv)$^{\rm col}$ for $(T_n^{\rm col}, n \ge 1)$ imply that the uncoloured multifurcating trees $(\widetilde{T}_n, n \ge 1)$ obtained from $(T_n^{\rm col}, n \ge 1)$ via crushing (cr) satisfy the growth rules (i)-(iii). Let therefore $\mathbf{t}_{n+1}^{\rm col}$ be a tree with $\mathbb{P}(T_{n+1}^{\rm col} = \mathbf{t}_{n+1}^{\rm col}) > 0$. It is easily seen that there is a unique tree $\mathbf{t}_n^{\rm col}$, a unique insertion edge $a_n^{\rm col} \to c_n^{\rm col}$ in $\mathbf{t}_n^{\rm col}$ and, if any, a unique colour $\chi_{n+1}(c_n^{\rm col})$ to create $\mathbf{t}_{n+1}^{\rm col}$ from $\mathbf{t}_n^{\rm col}$. Denote the trees obtained from $\mathbf{t}_n^{\rm col}$ and $\mathbf{t}_{n+1}^{\rm col}$ via crushing (cr) by $\mathbf{t}_n$ and $\mathbf{t}_{n+1}$. If $\chi_{n+1}(c_n^{\rm col}) = 0$, denote by $k+1 \ge 3$ the degree of the branch point of $\mathbf{t}_n$ with which $c_n^{\rm col}$ is identified in the first step of the crushing (cr).

• If the insertion edge is a leaf edge ($c_n^{\rm col} = i$ for some $i \in [n]$), we obtain

$$\mathbb{P}(\widetilde{T}_{n+1} = \mathbf{t}_{n+1} \mid \widetilde{T}_n = \mathbf{t}_n, T_n^{\rm col} = \mathbf{t}_n^{\rm col}) = (1-\alpha)/(n-\alpha).$$

• If the insertion edge has colour blue ($\chi_n(c_n^{\rm col}) = 1$) and also $\chi_{n+1}(c_n^{\rm col}) = 1$, we obtain

$$\mathbb{P}(\widetilde{T}_{n+1} = \mathbf{t}_{n+1} \mid \widetilde{T}_n = \mathbf{t}_n, T_n^{\rm col} = \mathbf{t}_n^{\rm col}) = \alpha(1-c)/(n-\alpha).$$

• If the insertion edge has colour blue ($\chi_n(c_n^{\rm col}) = 1$), but $\chi_{n+1}(c_n^{\rm col}) = 0$, or if the insertion edge has colour red ($\chi_n(c_n^{\rm col}) = 0$, and then necessarily $\chi_{n+1}(c_n^{\rm col}) = 0$ also), we obtain

$$\mathbb{P}(\widetilde{T}_{n+1} = \mathbf{t}_{n+1} \mid \widetilde{T}_n = \mathbf{t}_n, T_n^{\rm col} = \mathbf{t}_n^{\rm col}) = (c\alpha + (k-2)\alpha)/(n-\alpha),$$

because apart from $a_n^{\rm col} \to c_n^{\rm col}$, there are $k-2$ other edges in $\mathbf{t}_n^{\rm col}$, where insertion and crushing also create $\mathbf{t}_{n+1}$.

Because these conditional probabilities do not depend on $\mathbf{t}_n^{\rm col}$ and have the form required, we conclude that $(\widetilde{T}_n, n \ge 1)$ obeys the growth rules (i)-(iii) with $\gamma = \alpha(1-c)$. □

2.3 The Chinese Restaurant Process 

An important tool in this paper is the Chinese Restaurant Process (CRP), a partition-valued process $(\Pi_n, n \ge 1)$ due to Dubins and Pitman, see [22], which generates the Ewens-Pitman two-parameter family of exchangeable random partitions $\Pi_\infty$ of $\mathbb{N}$. In the restaurant framework, each block of a partition is represented by a table and each element of a block by a customer at a table. The construction rules are the following. The first customer sits at the first table and the following customers will be seated at an occupied table or a new one. Given $n$ customers at $k$ tables with $n_j \ge 1$ customers at the $j$th table, customer $n+1$ will be placed at the $j$th table with probability $(n_j - \alpha)/(n + \theta)$, and at a new table with probability $(\theta + k\alpha)/(n + \theta)$. The parameters $\alpha$ and $\theta$ can be chosen as either $\alpha < 0$ and $\theta = -m\alpha$ for some $m \in \mathbb{N}$ or $0 \le \alpha \le 1$ and $\theta > -\alpha$. We refer to this process as the CRP with $(\alpha, \theta)$-seating plan.

In the CRP $(\Pi_n, n \ge 1)$ with $\Pi_n \in \mathcal{P}_{[n]}$, we can study the block sizes, which leads us to consider the proportion of each table relative to the total number of customers. These proportions converge to limiting frequencies as follows.
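The $(\alpha, \theta)$-seating plan can be simulated in a few lines. This is a minimal sketch under the seating probabilities stated above; the function name and the representation of tables as lists of customer numbers are illustrative assumptions.

```python
import random


def chinese_restaurant(n, alpha, theta, seed=None):
    """Seat customers 1..n by the (alpha, theta)-seating plan.

    Returns the tables as lists of customers, in order of creation.
    """
    rng = random.Random(seed)
    tables = [[1]]  # the first customer sits at the first table
    for customer in range(2, n + 1):
        # weight n_j - alpha for each occupied table, theta + k*alpha for a new one;
        # the weights sum to (customer - 1) + theta
        weights = [len(t) - alpha for t in tables]
        weights.append(theta + len(tables) * alpha)
        j = rng.choices(range(len(tables) + 1), weights=weights)[0]
        if j == len(tables):
            tables.append([customer])
        else:
            tables[j].append(customer)
    return tables
```

For $\alpha < 0$ and $\theta = -m\alpha$ the new-table weight vanishes once $m$ tables are occupied, so the sketch never opens more than $m$ tables.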



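Lemma 7 (Theorem 3.2 in [22]). For each pair of parameters $(\alpha, \theta)$ subject to the constraints above, the Chinese restaurant with the $(\alpha, \theta)$-seating plan generates an exchangeable random partition $\Pi_\infty$ of $\mathbb{N}$. The corresponding EPPF is

$$p_{\alpha,\theta}^{\rm PD}(n_1, \ldots, n_k) = \frac{\alpha^{k-1}\, \Gamma(k + \theta/\alpha)\, \Gamma(1 + \theta)}{\Gamma(1 + \theta/\alpha)\, \Gamma(n + \theta)} \prod_{j=1}^{k} \frac{\Gamma(n_j - \alpha)}{\Gamma(1 - \alpha)}, \qquad n_j \ge 1,\ j \in [k];\ k \ge 1:\ n = n_1 + \ldots + n_k,$$

boundary cases by continuity. The corresponding limiting frequencies of block sizes, in size-biased order of least elements, are ${\rm GEM}_{\alpha,\theta}$ and can be represented as

$$(\tilde{P}_1, \tilde{P}_2, \ldots) = (W_1, \overline{W}_1 W_2, \overline{W}_1 \overline{W}_2 W_3, \ldots)$$

where the $W_i$ are independent, $W_i$ has ${\rm beta}(1-\alpha, \theta + i\alpha)$ distribution, and $\overline{W}_i := 1 - W_i$. The distribution of the associated ranked sequence of limiting frequencies is Poisson-Dirichlet ${\rm PD}_{\alpha,\theta}$.

We also associate with the EPPF $p_{\alpha,\theta}^{\rm PD}$ the distribution $q_{\alpha,\theta}^{\rm PD}$ of block sizes in decreasing order via (1) and, because the Chinese restaurant EPPF is not the EPPF of a splitting rule leading to $k \ge 2$ blocks (we use notation $q_{\alpha,\theta}^{\rm PD^*}$ for the splitting rules induced by conditioning on $k \ge 2$ blocks), but can lead to a single block, we also set $q_{\alpha,\theta}^{\rm PD}(n) = p_{\alpha,\theta}^{\rm PD}(n)$.

The asymptotic properties of the number $K_n$ of blocks of $\Pi_n$ under the $(\alpha, \theta)$-seating plan depend on $\alpha$: if $\alpha < 0$ and $\theta = -m\alpha$ for some $m \in \mathbb{N}$, then $K_n = m$ for all sufficiently large $n$ a.s.; if $\alpha = 0$ and $\theta > 0$, then $\lim_{n \to \infty} K_n / \log n = \theta$ a.s. The most relevant case for us is $\alpha > 0$.

Lemma 8 (Theorem 3.8 in [22]). For $0 < \alpha < 1$, $\theta > -\alpha$,

$$\frac{K_n}{n^\alpha} \to S \qquad \text{a.s. as } n \to \infty,$$

where $S$ has a continuous density on $(0, \infty)$ given by

$$\frac{d}{ds} \mathbb{P}(S \in ds) = \frac{\Gamma(\theta + 1)}{\Gamma(\theta/\alpha + 1)}\, s^{\theta/\alpha}\, g_\alpha(s),$$

and $g_\alpha$ is the density of the Mittag-Leffler distribution with $p$th moment $\Gamma(p+1)/\Gamma(p\alpha + 1)$.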
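The stick-breaking representation of the size-biased limiting frequencies in Lemma 7 lends itself to direct sampling. The following sketch draws the first $k$ frequencies; it assumes $0 \le \alpha < 1$ and $\theta > -\alpha$ so that both beta parameters are positive, and the function name is a hypothetical choice.

```python
import random


def gem_sticks(alpha, theta, k, seed=None):
    """Sample the first k GEM(alpha, theta) frequencies by stick-breaking.

    Returns (sticks, remaining), where remaining is the mass not yet allocated.
    """
    rng = random.Random(seed)
    sticks, remaining = [], 1.0
    for i in range(1, k + 1):
        w = rng.betavariate(1 - alpha, theta + i * alpha)  # W_i
        sticks.append(remaining * w)  # (1 - W_1) ... (1 - W_{i-1}) W_i
        remaining *= 1 - w
    return sticks, remaining
```

Sorting a long sample of sticks in decreasing order gives an approximation of a Poisson-Dirichlet sequence, up to the truncation at $k$ sticks.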

As an extension of the CRP, Pitman and Winkel in [23] introduced the ordered CRP. Its seating plan is as follows. The tables are ordered from left to right. Put the second table to the right of the first with probability $\theta/(\alpha + \theta)$ and to the left with probability $\alpha/(\alpha + \theta)$. Given $k$ tables, put the $(k+1)$st table to the right of the right-most table with probability $\theta/(k\alpha + \theta)$ and to the left of the left-most or between two adjacent tables with probability $\alpha/(k\alpha + \theta)$ each.

A composition of $n$ is a sequence $(n_1, \ldots, n_k)$ of positive integers with sum $n$. A sequence of random compositions $C_n$ of $n$ is called regenerative if conditionally given that the first part of $C_n$ is $n_1$, the remaining parts of $C_n$ form a composition of $n - n_1$ with the same distribution as $C_{n-n_1}$. Given any decrement matrix $(q^{\rm dec}(n, m), 1 \le m \le n)$, there is an associated sequence $C_n$ of regenerative random compositions of $n$ defined by specifying that $q^{\rm dec}(n, \cdot)$ is the distribution of the first part of $C_n$. Thus for each composition $(n_1, \ldots, n_k)$ of $n$,

$$\mathbb{P}(C_n = (n_1, \ldots, n_k)) = q^{\rm dec}(n, n_1)\, q^{\rm dec}(n - n_1, n_2) \cdots q^{\rm dec}(n_{k-1} + n_k, n_{k-1})\, q^{\rm dec}(n_k, n_k).$$

Lemma 9 (Proposition 6 (i) in [23]). For each $(\alpha, \theta)$ with $0 < \alpha < 1$ and $\theta \ge 0$, denote by $C_n$ the composition of block sizes in the ordered Chinese restaurant partition with parameters $(\alpha, \theta)$. Then $(C_n, n \ge 1)$ is regenerative, with decrement matrix

$$q_{\alpha,\theta}^{\rm dec}(n, m) = \binom{n}{m} \frac{(n-m)\alpha + m\theta}{n}\, \frac{\Gamma(m - \alpha)\, \Gamma(n - m + \theta)}{\Gamma(1 - \alpha)\, \Gamma(n + \theta)} \qquad (1 \le m \le n). \qquad (6)$$
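The decrement matrix (6) and the product formula for regenerative compositions can be checked numerically. The sketch below assumes $0 < \alpha < 1$ and $\theta > 0$ (so that all Gamma arguments are positive); the function names are illustrative.

```python
from math import comb, exp, lgamma


def q_dec(n, m, alpha, theta):
    """Decrement matrix entry of the ordered CRP, assuming theta > 0."""
    log_ratio = (lgamma(m - alpha) + lgamma(n - m + theta)
                 - lgamma(1 - alpha) - lgamma(n + theta))
    return comb(n, m) * ((n - m) * alpha + m * theta) / n * exp(log_ratio)


def composition_probability(parts, alpha, theta):
    """Probability of the composition `parts` under the regenerative structure."""
    prob, remaining = 1.0, sum(parts)
    for part in parts:
        prob *= q_dec(remaining, part, alpha, theta)  # first part of what is left
        remaining -= part
    return prob
```

Each row of the matrix should sum to one, which gives a quick sanity check of a transcription of (6).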



2.4 The splitting rule of alpha-gamma trees and the proof of Proposition 1

Proposition 1 claims that the unlabelled alpha-gamma trees $(T_n^\circ, n \ge 1)$ have the Markov branching property, identifies the splitting rule and studies the exchangeability of labels. In preparation of the proof of the Markov branching property, we use CRPs to compute the probability function of the first split of $T_n^\circ$ in Proposition 10. We will then establish the Markov branching property from a spinal decomposition result (Lemma 11) for $T_n^\circ$.

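Proposition 10. Let $T_n^\circ$ be an unlabelled alpha-gamma tree for some $0 \le \alpha < 1$ and $0 \le \gamma \le \alpha$, then the probability function of the first split of $T_n^\circ$ is

$$q_{\alpha,\gamma}^{\rm seq}(n_1, \ldots, n_k) = \frac{Z_n \Gamma(1-\alpha)}{\Gamma(n-\alpha)} \left( \gamma + (1-\alpha-\gamma) \frac{1}{n(n-1)} \sum_{i \ne j} n_i n_j \right) q_{\alpha,-\alpha-\gamma}^{\rm PD^*}(n_1, \ldots, n_k),$$

$n_1 \ge \ldots \ge n_k \ge 1$, $k \ge 2$: $n_1 + \ldots + n_k = n$, where $Z_n$ is the normalisation constant in (2).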
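The formula of Proposition 10 can be evaluated without Gamma functions, since $Z_n\, p_{\alpha,-\alpha-\gamma}^{\rm PD^*}(n_1, \ldots, n_k) = a_k \prod_j w_{n_j}$ with $w_n = (1-\alpha)(2-\alpha) \cdots (n-1-\alpha)$ and, for $\theta = -\alpha-\gamma$, $a_k = \prod_{i=2}^{k-1} ((i-1)\alpha - \gamma)$. The sketch below uses these product forms, which are a rewriting assumed here for numerical convenience; all function names are illustrative.

```python
from collections import Counter
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def w(n, alpha):
    """w_n = Gamma(n - alpha) / Gamma(1 - alpha) as a finite product."""
    out = 1.0
    for i in range(1, n):
        out *= i - alpha
    return out


def q_seq(parts, alpha, gamma):
    """First-split probability of an alpha-gamma tree for decreasing `parts`."""
    n, k = sum(parts), len(parts)
    count = factorial(n)  # number of set partitions of [n] with these block sizes
    for p in parts:
        count //= factorial(p)
    for mult in Counter(parts).values():
        count //= factorial(mult)
    gibbs = 1.0  # a_k * prod_j w_{n_j} with theta = -alpha - gamma
    for i in range(2, k):
        gibbs *= (i - 1) * alpha - gamma
    for p in parts:
        gibbs *= w(p, alpha)
    cross = n * n - sum(p * p for p in parts)  # sum over i != j of n_i * n_j
    factor = gamma + (1 - alpha - gamma) * cross / (n * (n - 1))
    return factor * count * gibbs / w(n, alpha)
```

Summing `q_seq` over all partitions of $n$ with at least two parts should give one; for $n = 3$ the values reduce to $(2 - 2\alpha + \gamma)/(2-\alpha)$ for the split $(2,1)$ and $(\alpha-\gamma)/(2-\alpha)$ for $(1,1,1)$, as can also be read off from the growth rules.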

Proof. We start from the growth rules of the labelled alpha-gamma trees $T_n$. Consider the spine ROOT $\to v_1 \to \ldots \to v_{L_{n-1}} \to 1$ of $T_n$, and the spinal subtrees $S_{ij}^{\rm sp}$, $1 \le i \le L_{n-1}$, $1 \le j \le K_{n,i}$, not containing 1, of the spinal vertices $v_i$, $i \in [L_{n-1}]$. By joining together the subtrees of the spinal vertex $v_i$ we form the $i$th spinal bush $S_i^{\rm sp} = S_{i1}^{\rm sp} * \ldots * S_{iK_{n,i}}^{\rm sp}$. Suppose a bush $S_i^{\rm sp}$ consists of $k$ subtrees with $m$ leaves in total, then its weight will be $m - k\alpha - \gamma + k\alpha = m - \gamma$ according to growth rule (i); recall that the total weight of the tree $T_n$ is $n - \alpha$.

Now we consider each bush as a table, each leaf $n = 2, 3, \ldots$ as a customer, 2 being the first customer. Adding a new leaf to a bush or to an edge on the spine corresponds to adding a new customer to an existing or to a new table. The weights are such that we construct an ordered Chinese restaurant partition of $\mathbb{N} \setminus \{1\}$ with parameters $(\gamma, 1-\alpha)$.

Suppose that the first split of $T_n$ is into tree components with numbers of leaves $n_1 \ge \ldots \ge n_k \ge 1$. Now suppose further that leaf 1 is in the subtree with $n_i$ leaves in the first split, then the first spinal bush $S_1^{\rm sp}$ will have $n - n_i$ leaves. Notice that this event is equivalent to that of $n - n_i$ customers sitting at the first table with a total of $n - 1$ customers present, in the terminology of the ordered CRP. According to Lemma 9, the probability of this is

$$q_{\gamma,1-\alpha}^{\rm dec}(n-1, n-n_i) = \binom{n-1}{n-n_i} \frac{(n_i - 1)\gamma + (n - n_i)(1-\alpha)}{n-1}\, \frac{\Gamma(n_i - \alpha)\, \Gamma(n - n_i - \gamma)}{\Gamma(n - \alpha)\, \Gamma(1 - \gamma)}$$

$$= \binom{n}{n_i} \left( \frac{n_i}{n} \gamma + \frac{n_i (n - n_i)}{n(n-1)} (1-\alpha-\gamma) \right) \frac{\Gamma(n_i - \alpha)\, \Gamma(n - n_i - \gamma)}{\Gamma(n - \alpha)\, \Gamma(1 - \gamma)}.$$

Next consider the probability that the first bush $S_1^{\rm sp}$ joins together subtrees with $n_1 \ge \ldots \ge n_{i-1} \ge n_{i+1} \ge \ldots \ge n_k \ge 1$ leaves conditional on the event that leaf 1 is in a subtree with $n_i$ leaves. The first bush has a weight of $n - n_i - \gamma$ and each subtree in it has a weight of $n_j - \alpha$, $j \ne i$. Consider these $k - 1$ subtrees as tables and the leaves in the first bush as customers. According to the growth procedure, they form a second (unordered, this time) Chinese restaurant partition with parameters $(\alpha, -\gamma)$, whose EPPF is

$$p_{\alpha,-\gamma}^{\rm PD}(n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k) = \frac{\alpha^{k-2}\, \Gamma(k - 1 - \gamma/\alpha)\, \Gamma(1 - \gamma)}{\Gamma(1 - \gamma/\alpha)\, \Gamma(n - n_i - \gamma)} \prod_{j \in [k] \setminus \{i\}} \frac{\Gamma(n_j - \alpha)}{\Gamma(1 - \alpha)}.$$

Let $m_j$ be the number of $j$'s in the sequence of $(n_1, \ldots, n_k)$. Based on the exchangeability of the second Chinese restaurant partition, the probability that the first bush consists of subtrees with $n_1 \ge \ldots \ge n_{i-1} \ge n_{i+1} \ge \ldots \ge n_k \ge 1$ leaves conditional on the event that leaf 1 is in one of the $m_{n_i}$ subtrees with $n_i$ leaves will be

$$\frac{m_{n_i}}{m_1! \cdots m_n!} \binom{n - n_i}{n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k}\, p_{\alpha,-\gamma}^{\rm PD}(n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k).$$

Thus the joint probability that the first split is $(n_1, \ldots, n_k)$ and that leaf 1 is in a subtree with $n_i$ leaves is

$$\frac{m_{n_i}}{m_1! \cdots m_n!} \binom{n - n_i}{n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k}\, q_{\gamma,1-\alpha}^{\rm dec}(n-1, n-n_i)\, p_{\alpha,-\gamma}^{\rm PD}(n_1, \ldots, n_{i-1}, n_{i+1}, \ldots, n_k)$$

$$= m_{n_i} \left( \frac{n_i}{n} \gamma + \frac{n_i (n - n_i)}{n(n-1)} (1-\alpha-\gamma) \right) \frac{Z_n \Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, q_{\alpha,-\alpha-\gamma}^{\rm PD^*}(n_1, \ldots, n_k). \qquad (7)$$

Hence the splitting rule will be the sum of (7) for all different $n_i$ (not $i$) in $(n_1, \ldots, n_k)$, but they contain factors $m_{n_i}$, so we can write it as sum over $i \in [k]$:

$$q_{\alpha,\gamma}^{\rm seq}(n_1, \ldots, n_k) = \sum_{i=1}^{k} \left( \frac{n_i}{n} \gamma + \frac{n_i (n - n_i)}{n(n-1)} (1-\alpha-\gamma) \right) \frac{Z_n \Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, q_{\alpha,-\alpha-\gamma}^{\rm PD^*}(n_1, \ldots, n_k)$$

$$= \left( \gamma + (1-\alpha-\gamma) \frac{1}{n(n-1)} \sum_{i \ne j} n_i n_j \right) \frac{Z_n \Gamma(1-\alpha)}{\Gamma(n-\alpha)}\, q_{\alpha,-\alpha-\gamma}^{\rm PD^*}(n_1, \ldots, n_k).$$

□



We can use the nested Chinese restaurants described in the proof to study the subtrees of the spine of $T_n$. We have decomposed $T_n$ into the subtrees $S_{ij}^{\rm sp}$ of the spine from the ROOT to 1 and can, conversely, build $T_n$ from $S_{ij}^{\rm sp}$, for which we now introduce notation

$$T_n = \coprod_{i,j} S_{ij}^{\rm sp}.$$

We will also write $\coprod_{i,j} S_{ij}^\circ$ when we join together unlabelled trees $S_{ij}^\circ$ along a spine. The following unlabelled version of a spinal decomposition theorem will entail the Markov branching property.

Lemma 11 (Spinal decomposition). Let $(T_n^{1,\circ}, n \ge 1)$ be alpha-gamma trees, delabelled apart from label 1. For all $n \ge 2$, the tree $T_n^{1,\circ}$ has the same distribution as $\coprod_{i,j} S_{ij}^\circ$, where

• $C_{n-1} = (N_1, \ldots, N_{L_{n-1}})$ is a regenerative composition with decrement matrix $q_{\gamma,1-\alpha}^{\rm dec}$,

• conditionally given $L_{n-1} = \ell$ and $N_i = n_i$, $i \in [\ell]$, the sizes $N_{i1} \ge \ldots \ge N_{iK_{n,i}} \ge 1$ form random compositions of $n_i$ with distribution $q_{\alpha,-\gamma}^{\rm PD}$, independently for $i \in [\ell]$,

• conditionally given also $K_{n,i} = k_i$ and $N_{ij} = n_{ij}$, the trees $S_{ij}^\circ$, $j \in [k_i]$, $i \in [\ell]$, are independent and distributed as $T_{n_{ij}}^\circ$.

Proof. For an induction on $n$, note that the claim is true for $n = 2$, since $T^{\circ 1}_2$ and $\coprod_i\coprod_j S^\circ_{ij}$ are deterministic for $n = 2$. Suppose then that the claim is true for some $n \ge 2$ and consider $T^{\circ 1}_{n+1}$. The growth rules (i)-(iii) of the labelled alpha-gamma tree $T_n$ are such that

• leaf $n + 1$ is inserted into a new bush or any of the bushes $S^{\rm sp}_i$ selected according to the rules of the ordered CRP with $(\gamma, 1 - \alpha)$-seating plan,

• further into a new subtree or any of the subtrees $S^{\rm sp}_{ij}$ of the selected bush $S^{\rm sp}_i$ according to the rules of a CRP with $(\alpha, -\gamma)$-seating plan,

• and further within the subtree $S^{\rm sp}_{ij}$ according to the weights assigned by (i) and growth rules (ii)-(iii).

These selections do not depend on $T_n$ except via $T^{\circ 1}_n$. In fact, since labels do not feature in the growth rules (i)-(iii), they are easily seen to induce growth rules for partially labelled alpha-gamma trees $T^{\circ 1}_n$, and also for unlabelled alpha-gamma trees such as $S^\circ_{ij}$.

From these observations and the induction hypothesis, we deduce the claim for $T^{\circ 1}_{n+1}$. □
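The two-level selection used in this proof can be made concrete with a small sketch. It assumes the alpha-gamma weights as we read them from the surrounding text: a subtree with $m$ leaves carries total weight $m-\alpha$, a spinal branch point carrying $k$ subtrees has weight $k\alpha-\gamma$, each inner spine edge has weight $\gamma$, and the edge of leaf 1 has weight $1-\alpha$. The function name and the list-of-lists encoding of the bushes are illustrative choices, not notation from the paper.

```python
from fractions import Fraction

def insertion_probabilities(bushes, alpha, gamma):
    """bushes[i] lists the leaf counts of the subtrees in the i-th spinal bush.

    Returns three items describing where leaf n+1 goes:
    probabilities for each existing subtree, for a new subtree in each
    existing bush, and for a new bush somewhere on the spine.
    """
    n = 1 + sum(sum(b) for b in bushes)      # leaf 1 plus all leaves off the spine
    total = n - alpha                         # total weight of an n-leaf tree
    existing = [[(m - alpha) / total for m in b] for b in bushes]
    new_subtree = [(len(b) * alpha - gamma) / total for b in bushes]
    # len(bushes) inner spine edges of weight gamma, plus the leaf edge of leaf 1
    new_bush = (len(bushes) * gamma + 1 - alpha) / total
    return existing, new_subtree, new_bush

existing, new_subtree, new_bush = insertion_probabilities(
    [[2, 1], [3]], Fraction(1, 2), Fraction(1, 4))
```

Summing the entries that belong to one bush recovers the table weight $n_i-\gamma$ of the $(\gamma,1-\alpha)$ seating plan, which is the factorisation the induction step relies on.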
Proof of Proposition 1. (a) Firstly, the distributions of the first splits of the unlabelled alpha-gamma trees $T^\circ_n$ were calculated in Proposition 10, for $0 \le \alpha < 1$ and $0 \le \gamma \le \alpha$.

Secondly, let $0 \le \alpha < 1$ and $0 \le \gamma \le \alpha$. By the regenerative property of the spinal composition $C_{n-1}$ and the conditional distribution of $T^{\circ 1}_n$ given $C_{n-1}$ identified in Lemma 11, we obtain that given $N_1 = m$, $K_{n,1} = k_1$ and $N_{1j} = n_{1j}$, $j \in [k_1]$, the subtrees $S^\circ_{1j}$, $j \in [k_1]$, are independent alpha-gamma trees distributed as $T^\circ_{n_{1j}}$, also independent of the remaining tree $S_{1,0} := \coprod_{i \ge 2}\coprod_j S^\circ_{ij}$, which, by Lemma 11, has the same distribution as $T^{\circ 1}_{n-m}$.

This is equivalent to saying that conditionally given that the first split is into subtrees with $n_1 \ge \ldots \ge n_i \ge \ldots \ge n_k \ge 1$ leaves and that leaf 1 is in a subtree with $n_i$ leaves, the delabelled subtrees $S^\circ_1, \ldots, S^\circ_k$ of the common ancestor are independent and distributed as $T^\circ_{n_j}$ respectively, $j \in [k]$. Since this conditional distribution does not depend on $i$, we have established the Markov branching property of $T^\circ_n$.

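(b) Notice that if $\gamma = 1 - \alpha$, the alpha-gamma model is the model related to stable trees, the labelling of which is known to be exchangeable, see Section 3.4.

On the other hand, if $\gamma \ne 1 - \alpha$, let us turn to look at the distribution of $T_3$.

[Figure: two labelled three-leaf trees with the same unlabelled shape. In the first, leaves 1 and 2 form the cherry and leaf 3 is attached at the first branch point (probability $\gamma/(2-\alpha)$); in the second, leaves 1 and 3 form the cherry (probability $(1-\alpha)/(2-\alpha)$).]

We can see that the probabilities of the two labelled trees in the above picture are different although they have the same unlabelled tree. So if $\gamma \ne 1 - \alpha$, $T_n$ is not exchangeable. □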

2.5 Sampling consistency and strong sampling consistency 

Recall that an unlabelled Markov branching tree $T^\circ_n$, $n \ge 2$, has the property of sampling consistency if, when we select a leaf uniformly and delete it (together with the adjacent branch point if its degree is reduced to 2), the new tree, denoted by $T^\circ_{n,-1}$, is distributed as $T^\circ_{n-1}$. Denote by $d : \mathbb{B}_n \to \mathbb{B}_{n-1}$ the induced deletion operator on the space $\mathbb{B}_n$ of probability measures on $\mathbb{T}^\circ_n$, so that for the distribution $P_n$ of $T^\circ_n$, we define $d(P_n)$ as the distribution of $T^\circ_{n,-1}$. Sampling consistency is equivalent to $d(P_n) = P_{n-1}$. This property is also called deletion stability in [12].



Proposition 12. The unlabelled alpha-gamma trees for $0 \le \alpha \le 1$ and $0 \le \gamma \le \alpha$ are sampling consistent.

Proof. The sampling consistency formula (14) in [16] states that $d(P_n) = P_{n-1}$ is equivalent to

$$q(n_1,\ldots,n_k) = \sum_{i=1}^k \frac{(n_i+1)(m_{n_i+1}+1)}{n\,m_{n_i}}\, q\big((n_1,\ldots,n_i+1,\ldots,n_k)^\downarrow\big) + \frac{m_1+1}{n}\, q(n_1,\ldots,n_k,1) + \frac{1}{n}\, q(n-1,1)\, q(n_1,\ldots,n_k) \tag{8}$$

for all $n_1 \ge \ldots \ge n_k \ge 1$ with $n_1 + \ldots + n_k = n - 1$, where $m_j$ is the number of $n_i$, $i \in [k]$, that equal $j$, and where $q$ is the splitting rule of $T^\circ_n \sim P_n$. In terms of EPPFs (1), formula (8) is equivalent to

$$\big(1 - p(n-1,1)\big)\, p(n_1,\ldots,n_k) = \sum_{i=1}^k p\big((n_1,\ldots,n_i+1,\ldots,n_k)^\downarrow\big) + p(n_1,\ldots,n_k,1). \tag{9}$$
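Now according to Proposition 1, the EPPF of the alpha-gamma model with $\alpha < 1$ is

$$p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k) = \frac{Z_n}{\Gamma_\alpha(n)}\left(\gamma + (1-\alpha-\gamma)\frac{1}{n(n-1)}\sum_{u \ne v} n_u n_v\right) p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k), \tag{10}$$

where $\Gamma_\alpha(n) = \Gamma(n-\alpha)/\Gamma(1-\alpha)$. Therefore, recalling that $n_1+\ldots+n_k = n-1$ in (9), $p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_i+1,\ldots,n_k)$ can be written as

$$\left(p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k) + 2(1-\alpha-\gamma)\,\frac{(n-2)(n-1-n_i) - \sum_{u\ne v} n_u n_v}{n(n-1)(n-2)}\,\frac{Z_{n-1}}{\Gamma_\alpha(n-1)}\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k)\right)\times\frac{n_i-\alpha}{n-1-\alpha},$$

and $p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k,1)$ as

$$\left(p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k) + 2(1-\alpha-\gamma)\,\frac{(n-1)(n-2) - \sum_{u\ne v} n_u n_v}{n(n-1)(n-2)}\,\frac{Z_{n-1}}{\Gamma_\alpha(n-1)}\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k)\right)\times\frac{(k-1)\alpha-\gamma}{n-1-\alpha}.$$

Summing the above formulas, the right-hand side of (9) is

$$\left(1 - \frac{1}{n-1-\alpha}\Big(\gamma + \frac{2}{n}(1-\alpha-\gamma)\Big)\right) p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k).$$

Notice that $p^{\rm seq}_{\alpha,\gamma}(n-1,1) = \big(\gamma + 2(1-\alpha-\gamma)/n\big)/(n-1-\alpha)$. Hence, the splitting rules of the alpha-gamma model satisfy (9), which implies sampling consistency for $\alpha < 1$. The case $\alpha = 1$ is postponed to Section 3.2. □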
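Identity (9) can also be checked numerically. The sketch below assumes the explicit form of the EPPF obtained by combining (10) with the product formula for $p^{\rm PD^*}_{\alpha,-\alpha-\gamma}$ used in Section 3.1, in which the constant $Z_n$ cancels; it is restricted to $0<\gamma<\alpha<1$ so that all gamma functions are finite. The function names are illustrative.

```python
from math import gamma as G, prod

def gamma_a(n, a):
    """Gamma_a(n) = Gamma(n - a) / Gamma(1 - a)."""
    return G(n - a) / G(1 - a)

def eppf(parts, alpha, gam):
    """Alpha-gamma EPPF for block sizes `parts` (at least two blocks)."""
    n, k = sum(parts), len(parts)
    cross = n * n - sum(m * m for m in parts)          # sum over u != v of n_u * n_v
    coeff = gam + (1 - alpha - gam) * cross / (n * (n - 1))
    pd = alpha ** (k - 2) * G(k - 1 - gam / alpha) / G(1 - gam / alpha)
    return coeff * pd * prod(gamma_a(m, alpha) for m in parts) / gamma_a(n, alpha)

def consistency_gap(parts, alpha, gam):
    """Left-hand side minus right-hand side of (9) for a composition of n - 1."""
    n = sum(parts) + 1
    lhs = (1 - eppf([n - 1, 1], alpha, gam)) * eppf(parts, alpha, gam)
    rhs = eppf(list(parts) + [1], alpha, gam)          # new leaf forms its own block
    for i in range(len(parts)):
        bumped = list(parts)
        bumped[i] += 1                                  # new leaf joins block i
        rhs += eppf(bumped, alpha, gam)
    return lhs - rhs
```

For example, `consistency_gap((3, 2, 1), 0.6, 0.25)` vanishes up to floating-point rounding.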

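Moreover, sampling consistency can be enhanced to strong sampling consistency [16] by requiring that $(T^\circ_{n,-1}, T^\circ_n)$ has the same distribution as $(T^\circ_{n-1}, T^\circ_n)$.

Proposition 13. The alpha-gamma model is strongly sampling consistent if and only if $\gamma = 1 - \alpha$.

Proof. For $\gamma = 1 - \alpha$, the model is known to be strongly sampling consistent, cf. Section 3.4.

[Figure: the two unlabelled trees $t^\circ_3$ and $t^\circ_4$.]

If $\gamma \ne 1 - \alpha$, consider the above two deterministic unlabelled trees. We have

$$\mathbb{P}(T^\circ_4 = t^\circ_4) = q^{\rm seq}_{\alpha,\gamma}(2,1,1)\, q^{\rm seq}_{\alpha,\gamma}(1,1) = \frac{(\alpha-\gamma)(5-5\alpha+\gamma)}{(2-\alpha)(3-\alpha)}.$$

Then we delete one of the two leaves at the first branch point of $t^\circ_4$ to get $t^\circ_3$. Therefore

$$\mathbb{P}\big((T^\circ_{4,-1}, T^\circ_4) = (t^\circ_3, t^\circ_4)\big) = \frac{1}{2}\,\mathbb{P}(T^\circ_4 = t^\circ_4) = \frac{(\alpha-\gamma)(5-5\alpha+\gamma)}{2(2-\alpha)(3-\alpha)}.$$

On the other hand, if $T^\circ_3 = t^\circ_3$, we have to add the new leaf to the first branch point to get $t^\circ_4$. Thus

$$\mathbb{P}\big((T^\circ_3, T^\circ_4) = (t^\circ_3, t^\circ_4)\big) = \frac{\alpha-\gamma}{3-\alpha}\,\mathbb{P}(T^\circ_3 = t^\circ_3) = \frac{(\alpha-\gamma)(2-2\alpha+\gamma)}{(2-\alpha)(3-\alpha)}.$$

It is easy to check that $\mathbb{P}((T^\circ_{4,-1}, T^\circ_4) = (t^\circ_3, t^\circ_4)) \ne \mathbb{P}((T^\circ_3, T^\circ_4) = (t^\circ_3, t^\circ_4))$ if $\gamma \ne 1 - \alpha$, which means that the alpha-gamma model is then not strongly sampling consistent. □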
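The comparison at the end of this proof is a one-line computation, sketched here with exact rational arithmetic. The two expressions are the probabilities derived above, with $\mathbb{P}(T^\circ_3 = t^\circ_3) = (2-2\alpha+\gamma)/(2-\alpha)$; the function name is an illustrative choice.

```python
from fractions import Fraction

def pair_probabilities(alpha, gamma):
    """Return the probability of (t3, t4) obtained by deletion and by growth."""
    p_t4 = (alpha - gamma) * (5 - 5 * alpha + gamma) / ((2 - alpha) * (3 - alpha))
    p_t3 = (2 - 2 * alpha + gamma) / (2 - alpha)
    by_deletion = Fraction(1, 2) * p_t4                  # remove one of two leaves out of four
    by_growth = (alpha - gamma) / (3 - alpha) * p_t3     # attach leaf 4 to the first branch point
    return by_deletion, by_growth

equal_case = pair_probabilities(Fraction(2, 3), Fraction(1, 3))    # gamma = 1 - alpha
unequal_case = pair_probabilities(Fraction(2, 3), Fraction(1, 6))  # gamma != 1 - alpha
```

The difference of the two values is proportional to $(\alpha-\gamma)(1-\alpha-\gamma)$, so for $\gamma<\alpha$ they agree exactly when $\gamma = 1-\alpha$.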
3 Dislocation measures and asymptotics of alpha-gamma trees

3.1 Dislocation measures associated with the alpha-gamma-splitting rules

Theorem 2 claims that the alpha-gamma trees are sampling consistent, which we proved in Section 2.5, and identifies the integral representation of the splitting rule in terms of a dislocation measure, which we will now establish.

Proof of Theorem 2. Firstly, we rearrange the coefficient of the sampling consistent splitting rules of alpha-gamma trees identified in Proposition 10:

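$$\gamma + (1-\alpha-\gamma)\frac{1}{n(n-1)}\sum_{i \ne j} n_i n_j = \frac{(n+1-\alpha-\gamma)(n-\alpha-\gamma)}{n(n-1)}\left(\gamma + (1-\alpha-\gamma)\Big(\sum_{i \ne j} A_{ij} + 2\sum_{i=1}^k B_i + C\Big)\right),$$

where

$$A_{ij} = \frac{(n_i-\alpha)(n_j-\alpha)}{(n+1-\alpha-\gamma)(n-\alpha-\gamma)}, \qquad B_i = \frac{(n_i-\alpha)((k-1)\alpha-\gamma)}{(n+1-\alpha-\gamma)(n-\alpha-\gamma)}, \qquad C = \frac{((k-1)\alpha-\gamma)(k\alpha-\gamma)}{(n+1-\alpha-\gamma)(n-\alpha-\gamma)}.$$

Notice that $B_i\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k)$ simplifies to

$$\frac{(n_i-\alpha)((k-1)\alpha-\gamma)}{(n+1-\alpha-\gamma)(n-\alpha-\gamma)}\,\frac{\alpha^{k-2}\Gamma(k-1-\gamma/\alpha)}{Z_n\,\Gamma(1-\gamma/\alpha)}\,\Gamma_\alpha(n_1)\cdots\Gamma_\alpha(n_k)$$

$$= \frac{Z_{n+2}}{Z_n(n+1-\alpha-\gamma)(n-\alpha-\gamma)}\,\frac{\alpha^{k-1}\Gamma(k-\gamma/\alpha)}{Z_{n+2}\,\Gamma(1-\gamma/\alpha)}\,\Gamma_\alpha(n_1)\cdots\Gamma_\alpha(n_i+1)\cdots\Gamma_\alpha(n_k)$$

$$= \frac{\widetilde Z_{n+2}}{\widetilde Z_n}\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_i+1,\ldots,n_k,1),$$

where $\Gamma_\alpha(n) = \Gamma(n-\alpha)/\Gamma(1-\alpha)$ and $\widetilde Z_n = Z_n\,\alpha\,\Gamma(1-\gamma/\alpha)/\Gamma(n-\alpha-\gamma)$ is the normalisation constant in (3) for $\nu = {\rm PD}^*_{\alpha,-\alpha-\gamma}$, as can be read from [17, Formula (17)]. According to (3),

$$p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k) = \frac{1}{\widetilde Z_n}\int_{\mathcal S^\downarrow} \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \prod_{j=1}^k s_{i_j}^{n_j}\ {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds).$$

Thus,

$$\sum_{i=1}^k B_i\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k) = \frac{1}{\widetilde Z_n}\int_{\mathcal S^\downarrow} \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \prod_{j=1}^k s_{i_j}^{n_j} \Bigg(\sum_{u \in \{i_1,\ldots,i_k\},\, v \notin \{i_1,\ldots,i_k\}} s_u s_v\Bigg) {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds).$$

Similarly,

$$\sum_{i \ne j} A_{ij}\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k) = \frac{1}{\widetilde Z_n}\int_{\mathcal S^\downarrow} \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \prod_{j=1}^k s_{i_j}^{n_j} \Bigg(\sum_{u,v \in \{i_1,\ldots,i_k\}:\, u \ne v} s_u s_v\Bigg) {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds),$$

$$C\, p^{\rm PD^*}_{\alpha,-\alpha-\gamma}(n_1,\ldots,n_k) = \frac{1}{\widetilde Z_n}\int_{\mathcal S^\downarrow} \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \prod_{j=1}^k s_{i_j}^{n_j} \Bigg(\sum_{u,v \notin \{i_1,\ldots,i_k\}:\, u \ne v} s_u s_v\Bigg) {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds).$$

Hence, the EPPF $p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k)$ of the sampling consistent splitting rule takes the following form:

$$p^{\rm seq}_{\alpha,\gamma}(n_1,\ldots,n_k) = \frac{1}{Y_n}\int_{\mathcal S^\downarrow} \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \prod_{j=1}^k s_{i_j}^{n_j} \Bigg(\gamma + (1-\alpha-\gamma)\sum_{i \ne j} s_i s_j\Bigg) {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds), \tag{11}$$

where $Y_n = n(n-1)\Gamma_\alpha(n)\,\alpha\,\Gamma(1-\gamma/\alpha)/\Gamma(n+2-\alpha-\gamma)$ is the normalization constant. Hence, we have $\nu_{\alpha,\gamma}(ds) = \big(\gamma + (1-\alpha-\gamma)\sum_{i \ne j} s_i s_j\big)\,{\rm PD}^*_{\alpha,-\alpha-\gamma}(ds)$. □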

3.2 The alpha-gamma model when $\alpha = 1$, spine with bushes of singleton-trees

Within the discussion of the alpha-gamma model so far, we restricted to $0 \le \alpha < 1$. In fact, we can still get some interesting results when $\alpha = 1$. The weight of each leaf edge is $1 - \alpha$ in the growth procedure of the alpha-gamma model. If $\alpha = 1$, the weight of each leaf edge becomes zero, which means that the new leaf can only be inserted into internal edges or branch points. Starting from the two-leaf tree, leaf 3 must be inserted into the root edge or the branch point. Similarly, any new leaf must be inserted into the spine leading from the root to the common ancestor of leaf 1 and leaf 2. Hence, the shape of the tree is just a spine with some bushes of one-leaf subtrees rooted on it. Moreover, the first split of an $n$-leaf tree will be $(n-k+1, 1, \ldots, 1)$ for some $2 \le k \le n-1$. The cases $\gamma = 0$ and $\gamma = 1$ lead to degenerate trees with, respectively, all leaves connected to a single branch point and all leaves connected to a spine of binary branch points (comb).
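The shape described above can be observed in a small simulation. The sketch below assumes the growth weights as we understand them from the rest of the paper: $1-\alpha$ for each leaf edge, $\gamma$ for every other edge (including the root edge), and $(k-1)\alpha-\gamma$ for a branch point with $k$ children. The `Node` class, the `grow` function and the dummy `holder` node standing in for the ROOT are illustrative constructions, not part of the paper.

```python
import random

class Node:
    def __init__(self, label=None):
        self.label = label        # leaf label; None marks a branch point
        self.children = []

def grow(n, alpha, gamma, rng):
    """Grow an n-leaf tree (n >= 2) and return its first branch point."""
    top = Node()
    top.children = [Node(1), Node(2)]
    holder = Node()               # dummy parent so that the root edge can be split
    holder.children = [top]
    for new_label in range(3, n + 1):
        items, weights = [], []
        stack = [(holder, holder.children[0])]
        while stack:
            parent, v = stack.pop()
            items.append(("edge", parent, v))
            if v.label is not None:
                weights.append(1 - alpha)                  # leaf edge
            else:
                weights.append(gamma)                      # inner edge
                items.append(("vertex", parent, v))
                weights.append((len(v.children) - 1) * alpha - gamma)
                stack.extend((v, c) for c in v.children)
        kind, parent, v = rng.choices(items, weights=weights)[0]
        leaf = Node(new_label)
        if kind == "vertex":
            v.children.append(leaf)                        # attach to the branch point
        else:
            mid = Node()                                   # split the chosen edge
            mid.children = [v, leaf]
            parent.children[parent.children.index(v)] = mid
    return holder.children[0]
```

With `alpha=1.0` every branch point produced by `grow` has at most one child that is itself a branch point, which is the "spine with bushes of one-leaf subtrees" picture.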

Proposition 14. Consider the alpha-gamma model with $\alpha = 1$ and $0 \le \gamma < 1$.

(a) The model is sampling consistent with splitting rules

$$q^{\rm seq}_{1,\gamma}(n_1,\ldots,n_k) = \begin{cases} \gamma\,\Gamma_\gamma(k-1)/(k-1)!, & \text{if } 2 \le k \le n-1 \text{ and } (n_1,\ldots,n_k) = (n-k+1,1,\ldots,1);\\ \Gamma_\gamma(n-1)/(n-2)!, & \text{if } k = n \text{ and } (n_1,\ldots,n_k) = (1,\ldots,1);\\ 0, & \text{otherwise,}\end{cases} \tag{12}$$

where $n_1 \ge \ldots \ge n_k \ge 1$ and $n_1 + \ldots + n_k = n$.

(b) The dislocation measure associated with the splitting rules can be expressed as follows:

$$\int_{\mathcal S^\downarrow} f(s_1,0,\ldots)\,\nu_{1,\gamma}(ds) = \int_0^1 f(s_1,0,\ldots)\,\big(\gamma(1-s_1)^{-1-\gamma}ds_1 + \delta_0(ds_1)\big). \tag{13}$$

In particular, it does not satisfy $\nu(\{s \in \mathcal S^\downarrow : s_1 + s_2 + \ldots < 1\}) = 0$.

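Proof. (a) We start from the growth procedure of the alpha-gamma model when $\alpha = 1$. Consider a first split into $(n-k+1,1,\ldots,1)$ for some labelled $n$-leaf tree. Suppose its first branch point is created when the leaf $l$ is inserted into the root edge for $l \ge 3$. At this time the first split is $(l-1,1)$ with a probability $\gamma/(l-2)$ as $\alpha = 1$. In the following insertions, leaves $l+1,\ldots,n$ have been added either to the first branch point or to the subtree with $l-1$ leaves at this time. Hence the probability that the first split of this tree is $(n-k+1,1,\ldots,1)$ is

$$\frac{(n-k-1)!}{(n-2)!}\,\gamma\,\Gamma_\gamma(k-1),$$

which does not depend on $l$. Notice that the growth rules imply that if the first split is $(n-k+1,1,\ldots,1)$ with $k \le n-1$, then leaves 1 and 2 will be located in the subtree with $n-k+1$ leaves. There are $\binom{n-2}{n-k-1}$ labelled trees with the above first split. Therefore,

$$q^{\rm seq}_{1,\gamma}(n-k+1,1,\ldots,1) = \binom{n-2}{n-k-1}\frac{(n-k-1)!}{(n-2)!}\,\gamma\,\Gamma_\gamma(k-1) = \gamma\,\Gamma_\gamma(k-1)/(k-1)!.$$

On the other hand, there is only one $n$-leaf labelled tree with a first split $(1,\ldots,1)$ and in this case, all leaves have been added to the only branch point. Hence

$$q^{\rm seq}_{1,\gamma}(1,\ldots,1) = \Gamma_\gamma(n-1)/(n-2)!.$$

For sampling consistency, we check criterion (8), which reduces to the two formulas

$$\Big(1 - \frac{1}{n}\,q^{\rm seq}_{1,\gamma}(n-1,1)\Big)\, q^{\rm seq}_{1,\gamma}(n-k,1,\ldots,1) = \frac{n-k+1}{n}\, q^{\rm seq}_{1,\gamma}(n-k+1,1,\ldots,1) + \frac{k}{n}\, q^{\rm seq}_{1,\gamma}(n-k,1,\ldots,1),$$

$$\Big(1 - \frac{1}{n}\,q^{\rm seq}_{1,\gamma}(n-1,1)\Big)\, q^{\rm seq}_{1,\gamma}(1,\ldots,1) = \frac{2}{n}\, q^{\rm seq}_{1,\gamma}(2,1,\ldots,1) + q^{\rm seq}_{1,\gamma}(1,\ldots,1),$$

where on each left-hand side the second factor is a splitting probability for $n-1$ leaves, and all terms on the right-hand sides are splitting probabilities for $n$ leaves.

(b) According to (12),

$$q^{\rm seq}_{1,\gamma}(n-k+1,1,\ldots,1) = \binom{n}{n-k+1}\frac{\Gamma_\gamma(n+1)}{n!}\,\gamma B(n-k+2,\,k-1-\gamma)$$

$$= \frac{1}{Y_n}\binom{n}{n-k+1}\int_0^1 s_1^{n-k+1}(1-s_1)^{k-1}\,\gamma(1-s_1)^{-1-\gamma}ds_1$$

$$= \frac{1}{Y_n}\binom{n}{n-k+1}\int_0^1 s_1^{n-k+1}(1-s_1)^{k-1}\,\big(\gamma(1-s_1)^{-1-\gamma}ds_1 + \delta_0(ds_1)\big), \tag{14}$$

where $Y_n = n!/\Gamma_\gamma(n+1)$. Similarly,

$$q^{\rm seq}_{1,\gamma}(1,\ldots,1) = \frac{1}{Y_n}\int_0^1 \big(n(1-s_1)^{n-1}s_1 + (1-s_1)^n\big)\,\big(\gamma(1-s_1)^{-1-\gamma}ds_1 + \delta_0(ds_1)\big). \tag{15}$$

Formulas (14) and (15) are of the form of [16, Formula (2)], which generalises (3) to the case where $\nu$ does not necessarily satisfy $\nu(\{s \in \mathcal S^\downarrow : s_1 + s_2 + \ldots < 1\}) = 0$, hence $\nu_{1,\gamma}$ is identified. □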
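As a quick sanity check on (12), the splitting probabilities for fixed $n$ should sum to one over $k = 2, \ldots, n$. The sketch below evaluates (12) directly, with $\Gamma_\gamma(m) = \Gamma(m-\gamma)/\Gamma(1-\gamma)$ by analogy with $\Gamma_\alpha$; the function names are illustrative.

```python
from math import gamma as G, factorial

def gamma_g(m, g):
    """Gamma_g(m) = Gamma(m - g) / Gamma(1 - g)."""
    return G(m - g) / G(1 - g)

def split_prob_alpha1(n, k, g):
    """Probability of a first split into k parts when alpha = 1.

    For 2 <= k <= n - 1 the split is (n-k+1, 1, ..., 1); for k = n it is (1, ..., 1).
    """
    if k == n:
        return gamma_g(n - 1, g) / factorial(n - 2)
    return g * gamma_g(k - 1, g) / factorial(k - 1)

def total_mass(n, g):
    """Sum of the splitting probabilities over all admissible k."""
    return sum(split_prob_alpha1(n, k, g) for k in range(2, n + 1))
```

For $n = 3$ this reduces to $\gamma + (1-\gamma)$: leaf 3 either splits the root edge or joins the existing branch point.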

3.3 Continuum random trees and self-similar trees 

Let $B \subset \mathbb{N}$ be finite. A labelled tree with edge lengths is a pair $\vartheta = (t, \eta)$, where $t \in \mathbb{T}_B$ is a labelled tree, $\eta = (\eta_A, A \in t \setminus \{{\rm ROOT}\})$ is a collection of marks, and every edge $C \to A$ of $t$ is associated with mark $\eta_A \in (0,\infty)$ at vertex $A$, which we interpret as the edge length of $C \to A$. Let $\Theta_B$ be the set of such trees $(t,\eta)$ with $t \in \mathbb{T}_B$.

We now introduce continuum trees, following the construction by Evans et al. in [9]. A complete separable metric space $(\tau, d)$ is called an $\mathbb{R}$-tree, if it satisfies the following two conditions:

1. for all $x, y \in \tau$, there is an isometry $\varphi_{x,y} : [0, d(x,y)] \to \tau$ such that $\varphi_{x,y}(0) = x$ and $\varphi_{x,y}(d(x,y)) = y$,

2. for every injective path $c : [0,1] \to \tau$ with $c(0) = x$ and $c(1) = y$, one has $c([0,1]) = \varphi_{x,y}([0, d(x,y)])$.

We will consider rooted $\mathbb{R}$-trees $(\tau, d, \rho)$, where $\rho \in \tau$ is a distinguished element, the root. We think of the root as the lowest element of the tree.
We denote the range of $\varphi_{x,y}$ by $[[x,y]]$ and call the quantity $d(\rho, x)$ the height of $x$. We say that $x$ is an ancestor of $y$ whenever $x \in [[\rho, y]]$. We let $x \wedge y$ be the unique element in $\tau$ such that $[[\rho, x]] \cap [[\rho, y]] = [[\rho, x \wedge y]]$, and call it the highest common ancestor of $x$ and $y$ in $\tau$. Denote by $(\tau_x, d|_{\tau_x}, x)$ the set of $y \in \tau$ such that $x$ is an ancestor of $y$, which is an $\mathbb{R}$-tree rooted at $x$ that we call the fringe subtree of $\tau$ above $x$.

Two rooted $\mathbb{R}$-trees $(\tau, d, \rho)$, $(\tau', d', \rho')$ are called equivalent if there is a bijective isometry between the two metric spaces that maps the root of one to the root of the other. We also denote by $\Theta$ the set of equivalence classes of compact rooted $\mathbb{R}$-trees. We define the Gromov-Hausdorff distance between two rooted $\mathbb{R}$-trees (or their equivalence classes) as

$$d_{\rm GH}(\tau, \tau') = \inf\{d_{\rm H}(\widetilde\tau, \widetilde\tau')\}$$

where the infimum is over all metric spaces $E$ and isometric embeddings $\widetilde\tau \subset E$ of $\tau$ and $\widetilde\tau' \subset E$ of $\tau'$ with common root $\widetilde\rho \in E$; the Hausdorff distance on compact subsets of $E$ is denoted by $d_{\rm H}$. Evans et al. [9] showed that $(\Theta, d_{\rm GH})$ is a complete separable metric space.

We call an element $x \in \tau$, $x \ne \rho$, in a rooted $\mathbb{R}$-tree $\tau$, a leaf if its removal does not disconnect $\tau$, and let $\mathcal{L}(\tau)$ be the set of leaves of $\tau$. On the other hand, we call an element of $\tau$ a branch point, if it has the form $x \wedge y$ where $x$ is neither an ancestor of $y$ nor vice versa. Equivalently, we can define branch points as points disconnecting $\tau$ into three or more connected components when removed. We let $\mathcal{B}(\tau)$ be the set of branch points of $\tau$.

A weighted $\mathbb{R}$-tree $(\tau, \mu)$ is called a continuum tree [1], if $\mu$ is a probability measure on $\tau$ and

1. $\mu$ is supported by the set $\mathcal{L}(\tau)$,

2. $\mu$ has no atom,

3. for every $x \in \tau \setminus \mathcal{L}(\tau)$, $\mu(\tau_x) > 0$.

A continuum random tree (CRT) is a random variable whose values are continuum trees, defined on some probability space $(\Omega, \mathcal{A}, \mathbb{P})$. Several methods to formalize this have been developed [2, 10, 13]. For technical simplicity, we use the method of Aldous [2]. Let the space $\ell_1 = \ell_1(\mathbb{N})$ be the base space for defining CRTs. We endow the set of compact subsets of $\ell_1$ with the Hausdorff metric, and the set of probability measures on $\ell_1$ with any metric inducing the topology of weak convergence, so that the set of pairs $(T, \mu)$, where $T$ is a rooted $\mathbb{R}$-tree embedded as a subset of $\ell_1$ and $\mu$ is a measure on $T$, is endowed with the product $\sigma$-algebra.

An exchangeable $\mathcal{P}_{\mathbb N}$-valued fragmentation process $(\Pi(t), t \ge 0)$ is called self-similar with index $a \in \mathbb{R}$ if given $\Pi(t) = \pi = \{\pi_i, i \ge 1\}$ with asymptotic frequencies $|\pi_i| = \lim_{n\to\infty} n^{-1}\#([n] \cap \pi_i)$, the random variable $\Pi(t+s)$ has the same law as the random partition whose blocks are those of $\pi_i \cap \Pi^{(i)}(|\pi_i|^a s)$, $i \ge 1$, where $(\Pi^{(i)}, i \ge 1)$ is a sequence of i.i.d. copies of $(\Pi(t), t \ge 0)$. The process $(|\Pi(t)|^\downarrow, t \ge 0)$ is an $\mathcal{S}^\downarrow$-valued self-similar fragmentation process. Bertoin proved that the distribution of a $\mathcal{P}_{\mathbb N}$-valued self-similar fragmentation process is determined by a triple $(a, c, \nu)$, where $a \in \mathbb{R}$, $c \ge 0$ and $\nu$ is a dislocation measure on $\mathcal{S}^\downarrow$. For this article, we are only interested in the case $c = 0$ and when $\nu(s_1 + s_2 + \ldots < 1) = 0$. We call $(a, \nu)$ the characteristic pair. When $a = 0$, the process $(\Pi(t), t \ge 0)$ is also called a homogeneous fragmentation process.

A CRT $(\mathcal{T}, \mu)$ is a self-similar CRT with index $a = -\gamma < 0$ if for every $t \ge 0$, given $(\mu(\mathcal{T}^i_t), i \ge 1)$, where $\mathcal{T}^i_t$, $i \ge 1$, is the ranked order of connected components of the open set $\{x \in \mathcal{T} : d(x, \rho(\mathcal{T})) > t\}$, the continuum random trees

$$\Big(\mu(\mathcal{T}^i_t)^{-\gamma}\,\mathcal{T}^i_t,\ \frac{\mu(\,\cdot \cap \mathcal{T}^i_t)}{\mu(\mathcal{T}^i_t)}\Big), \quad i \ge 1,$$

are i.i.d. copies of $(\mathcal{T}, \mu)$, where $\mu(\mathcal{T}^i_t)^{-\gamma}\mathcal{T}^i_t$ is the tree that has the same set of points as $\mathcal{T}^i_t$, but whose distance function is divided by $\mu(\mathcal{T}^i_t)^{\gamma}$. Haas and Miermont in [15] have shown that there exists a self-similar continuum random tree $\mathcal{T}_{(\gamma,\nu)}$ characterized by such a pair $(\gamma, \nu)$, which can be constructed from a self-similar fragmentation process with characteristic pair $(\gamma, \nu)$.
3.4 The alpha-gamma model when $\gamma = 1 - \alpha$, sampling from the stable CRT

Let $(\mathcal{T}, \rho, \mu)$ be the stable tree of Duquesne and Le Gall [7]. The distribution on $\Theta$ of any CRT is determined by its so-called finite-dimensional marginals: the distributions of $\mathcal{R}_k$, $k \ge 1$, the subtrees $\mathcal{R}_k \subset \mathcal{T}$ defined as the discrete trees with edge lengths spanned by $\rho, U_1, \ldots, U_k$, where given $(\mathcal{T}, \mu)$, the sequence $U_i \in \mathcal{T}$, $i \ge 1$, of leaves is sampled independently from $\mu$. See also
[HI El [m [T71 [H] for various approaches to stable trees. Let us denote the discrete tree without 
edge lengths associated with $\mathcal{R}_n$ by $T_n$ and note the Markov branching structure.

Lemma 15 (Corollary 22 in [16]). Let $1/\alpha \in (1,2]$. The trees $T_n$, $n \ge 1$, sampled from the $(1/\alpha)$-stable CRT are Markov branching trees, whose splitting rule has EPPF

$$p^{\rm stable}_{1/\alpha}(n_1,\ldots,n_k) = \frac{\alpha^{k-2}\,\Gamma(k-1/\alpha)\,\Gamma(2-\alpha)}{\Gamma(2-1/\alpha)\,\Gamma(n-\alpha)}\prod_{j=1}^k \frac{\Gamma(n_j-\alpha)}{\Gamma(1-\alpha)}$$

for any $k \ge 2$, $n_1 \ge 1, \ldots, n_k \ge 1$, $n = n_1 + \ldots + n_k$.

We recognise $p^{\rm stable}_{1/\alpha} = p^{\rm PD^*}_{\alpha,-1}$ in (2), and by Proposition 1, we have $p^{\rm PD^*}_{\alpha,-1} = p^{\rm seq}_{\alpha,1-\alpha}$. This observation yields the following corollary:

Corollary 16. The alpha-gamma trees with $\gamma = 1 - \alpha$ are strongly sampling consistent and exchangeable.

Proof. These properties follow from the representation by sampling from the stable CRT, particularly the exchangeability of the sequence $U_i$, $i \ge 1$. Specifically, since $U_i$, $i \ge 1$, are conditionally independent and identically distributed given $(\mathcal{T}, \mu)$, they are exchangeable. If we denote by $\mathcal{L}_{n,-1}$ the random set of leaves $\mathcal{L}_n = \{U_1, \ldots, U_n\}$ with a uniformly chosen member removed, then $(\mathcal{L}_{n,-1}, \mathcal{L}_n)$ has the same conditional distribution as $(\mathcal{L}_{n-1}, \mathcal{L}_n)$. Hence the pairs of (unlabelled) tree shapes spanned by $\rho$ and these sets of leaves have the same distribution; this is strong sampling consistency as defined before Proposition 13. □
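The identification of the stable EPPF with the alpha-gamma EPPF at $\gamma = 1-\alpha$ can be confirmed numerically. The sketch below assumes the product form of $p^{\rm seq}_{\alpha,\gamma}$ obtained from (10), in which the constant $Z_n$ cancels, and takes $1/2 < \alpha < 1$ so that every gamma function is finite. Function names are illustrative.

```python
from math import gamma as G, prod

def stable_eppf(parts, alpha):
    """EPPF of Lemma 15 for the (1/alpha)-stable tree."""
    n, k = sum(parts), len(parts)
    front = (alpha ** (k - 2) * G(k - 1 / alpha) * G(2 - alpha)
             / (G(2 - 1 / alpha) * G(n - alpha)))
    return front * prod(G(m - alpha) / G(1 - alpha) for m in parts)

def alpha_gamma_eppf(parts, alpha, gam):
    """Alpha-gamma EPPF in product form (an assumption read off from (10))."""
    n, k = sum(parts), len(parts)
    cross = n * n - sum(m * m for m in parts)           # sum over u != v of n_u * n_v
    coeff = gam + (1 - alpha - gam) * cross / (n * (n - 1))
    pd = alpha ** (k - 2) * G(k - 1 - gam / alpha) / G(1 - gam / alpha)
    blocks = prod(G(m - alpha) / G(1 - alpha) for m in parts)
    return coeff * pd * blocks * G(1 - alpha) / G(n - alpha)
```

At $\gamma = 1-\alpha$ the cross term in the coefficient drops out and $\gamma\,\Gamma(1-\alpha) = \Gamma(2-\alpha)$, which is exactly the factor appearing in Lemma 15.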

3.5 Dislocation measures in size-biased order 

In actual calculations, we may find that the splitting rules in Proposition 1 are quite difficult and the corresponding dislocation measure $\nu$ is always inexplicit, which leads us to transform $\nu$ to a more explicit form. The method proposed here is to change the space $\mathcal{S}^\downarrow$ into the space $[0,1]^{\mathbb N}$ and to rearrange the elements $s \in \mathcal{S}^\downarrow$ under $\nu$ into the size-biased random order that places $s_{i_1}$ first with probability $s_{i_1}$ (its size) and, successively, the remaining ones with probabilities $s_{i_j}/(1 - s_{i_1} - \cdots - s_{i_{j-1}})$ proportional to their sizes $s_{i_j}$ into the following positions, $j \ge 2$.
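For a finite mass partition, the size-biased reordering just described amounts to repeated sampling without replacement, with probabilities proportional to the remaining sizes. A minimal sketch, with an illustrative function name and restricted to finitely many strictly positive masses:

```python
import random

def size_biased_order(masses, rng):
    """Return the entries of `masses` in a size-biased random order.

    Each step picks one of the remaining entries with probability
    proportional to its size and moves it to the next position.
    """
    remaining = list(masses)
    ordered = []
    while remaining:
        j = rng.choices(range(len(remaining)), weights=remaining)[0]
        ordered.append(remaining.pop(j))
    return ordered

example = size_biased_order([0.5, 0.3, 0.2], random.Random(7))
```

The first entry of the output equals a given $s_i$ with probability $s_i$ when the masses sum to one, which is the defining property used in Definition 2.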

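Definition 2. We call a measure $\nu^{\rm sb}$ on the space $[0,1]^{\mathbb N}$ the size-biased dislocation measure associated with dislocation measure $\nu$, if for any subset $A_1 \times A_2 \times \ldots \times A_k \times [0,1]^{\mathbb N}$ of $[0,1]^{\mathbb N}$,

$$\nu^{\rm sb}(A_1 \times A_2 \times \ldots \times A_k \times [0,1]^{\mathbb N}) = \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \int_{\{s \in \mathcal{S}^\downarrow :\ s_{i_1} \in A_1, \ldots, s_{i_k} \in A_k\}} \frac{s_{i_1} \cdots s_{i_k}}{\prod_{j=1}^{k-1}\big(1 - \sum_{l=1}^{j} s_{i_l}\big)}\ \nu(ds) \tag{16}$$

for any $k \in \mathbb{N}$, where $\nu$ is a dislocation measure on $\mathcal{S}^\downarrow$ satisfying $\nu(s \in \mathcal{S}^\downarrow : s_1 + s_2 + \ldots < 1) = 0$. We also denote by $\nu^{\rm sb}_k(A_1 \times A_2 \times \ldots \times A_k) = \nu^{\rm sb}(A_1 \times A_2 \times \ldots \times A_k \times [0,1]^{\mathbb N})$ the distribution of the first $k$ marginals.

The sum in (16) is over all possible rank sequences $(i_1, \ldots, i_k)$ to determine the first $k$ entries of the size-biased vector. The integral in (16) is over the decreasing sequences that have the $j$th entry of the re-ordered vector fall into $A_j$, $j \in [k]$. Notice that the support of such a size-biased dislocation measure $\nu^{\rm sb}$ is a subset of $\mathcal{S}^{\rm sb} := \{s \in [0,1]^{\mathbb N} : \sum_{i \ge 1} s_i = 1\}$. If we denote by $s^\downarrow$ the sequence $s \in \mathcal{S}^{\rm sb}$ rearranged into ranked order, taking (16) into formula (3), we obtain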
Proposition 17. The EPPF associated with a dislocation measure $\nu$ can be represented as:

$$p(n_1,\ldots,n_k) = \frac{1}{\widetilde Z_n}\int_{[0,1]^k} x_1^{n_1-1} \cdots x_k^{n_k-1} \prod_{j=1}^{k-1}\Big(1 - \sum_{l=1}^{j} x_l\Big)\ \nu^{\rm sb}_k(dx),$$

where $\nu^{\rm sb}$ is the size-biased dislocation measure associated with $\nu$, where $n_1 \ge \ldots \ge n_k \ge 1$, $k \ge 2$, $n = n_1 + \ldots + n_k$ and $x = (x_1, \ldots, x_k)$.

Now we turn to the case of Poisson-Dirichlet measures ${\rm PD}^*_{\alpha,\theta}$ to then study $\nu^{\rm sb}_{\alpha,\gamma}$.

Lemma 18. If we define ${\rm GEM}^*_{\alpha,\theta}$ as the size-biased dislocation measure associated with ${\rm PD}^*_{\alpha,\theta}$ for $0 < \alpha < 1$ and $\theta > -2\alpha$, then the first $k$ marginals have joint density

$${\rm gem}^*_{\alpha,\theta}(x_1,\ldots,x_k) = \frac{\alpha\,\Gamma(2+\theta/\alpha)}{\Gamma(1-\alpha)\,\Gamma(\theta+\alpha+1)\prod_{j=2}^{k} B(1-\alpha,\theta+j\alpha)}\ \frac{\big(1-\sum_{i=1}^{k} x_i\big)^{\theta+k\alpha}\prod_{j=1}^{k} x_j^{-\alpha}}{\prod_{j=1}^{k}\big(1-\sum_{i=1}^{j} x_i\big)}, \tag{17}$$

where $B(a,b) = \int_0^1 x^{a-1}(1-x)^{b-1}dx$ is the beta function.

This is a simple $\sigma$-finite extension of the GEM distribution and (17) can be derived analogously to Lemma 7. Applying Proposition 17, we can get an explicit form of the size-biased dislocation measure associated with the alpha-gamma model.

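Proof of Proposition 4. We start our proof from the dislocation measure associated with the alpha-gamma model. According to (5) and (16), the first $k$ marginals of $\nu^{\rm sb}_{\alpha,\gamma}$ are given by

$$\nu^{\rm sb}_k(A_1 \times \ldots \times A_k) = \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \int_{\{s \in \mathcal{S}^\downarrow :\ s_{i_1} \in A_1, \ldots, s_{i_k} \in A_k\}} \frac{s_{i_1} \cdots s_{i_k}}{\prod_{j=1}^{k-1}\big(1 - \sum_{l=1}^{j} s_{i_l}\big)} \Big(\gamma + (1-\alpha-\gamma)\sum_{i \ne j} s_i s_j\Big) {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds) = \gamma D + (1-\alpha-\gamma)(E - F),$$

where

$$D = \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \int_{\{s \in \mathcal{S}^\downarrow :\ s_{i_1} \in A_1, \ldots, s_{i_k} \in A_k\}} \frac{s_{i_1} \cdots s_{i_k}}{\prod_{j=1}^{k-1}\big(1 - \sum_{l=1}^{j} s_{i_l}\big)}\ {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds) = {\rm GEM}^*_{\alpha,-\alpha-\gamma}(A_1 \times \ldots \times A_k),$$

$$E = \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \int_{\{s \in \mathcal{S}^\downarrow :\ s_{i_1} \in A_1, \ldots, s_{i_k} \in A_k\}} \Big(1 - \sum_{u=1}^{k} s_{i_u}^2\Big) \frac{s_{i_1} \cdots s_{i_k}}{\prod_{j=1}^{k-1}\big(1 - \sum_{l=1}^{j} s_{i_l}\big)}\ {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds) = \int_{A_1 \times \ldots \times A_k} \Big(1 - \sum_{i=1}^{k} x_i^2\Big) {\rm GEM}^*_{\alpha,-\alpha-\gamma}(dx),$$

$$F = \sum_{\substack{i_1,\ldots,i_k \ge 1\\ \text{distinct}}} \int_{\{s \in \mathcal{S}^\downarrow :\ s_{i_1} \in A_1, \ldots, s_{i_k} \in A_k\}} \Big(\sum_{v \notin \{i_1,\ldots,i_k\}} s_v^2\Big) \frac{s_{i_1} \cdots s_{i_k}}{\prod_{j=1}^{k-1}\big(1 - \sum_{l=1}^{j} s_{i_l}\big)}\ {\rm PD}^*_{\alpha,-\alpha-\gamma}(ds) = \int_{A_1 \times \ldots \times A_k \times [0,1]} x_{k+1}\Big(1 - \sum_{i=1}^{k} x_i\Big) {\rm GEM}^*_{\alpha,-\alpha-\gamma}(d(x_1,\ldots,x_{k+1})).$$

Applying (17) to $F$ (and setting $\theta = -\alpha-\gamma$), then integrating out $x_{k+1}$, we get:

$$F = \int_{A_1 \times \ldots \times A_k} \frac{1-\alpha}{1+(k-1)\alpha-\gamma}\Big(1 - \sum_{i=1}^{k} x_i\Big)^2 {\rm GEM}^*_{\alpha,-\alpha-\gamma}(dx).$$

Summing over $D$, $E$, $F$, we obtain the formula stated in Proposition 4. □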

As the model related to stable trees is a special case of the alpha-gamma model when $\gamma = 1 - \alpha$, the size-biased dislocation measure for it is

$$\nu^{\rm sb}_{\alpha,1-\alpha}(ds) = (1-\alpha)\,{\rm GEM}^*_{\alpha,-1}(ds).$$

For general $(\alpha,\gamma)$, the explicit form of the dislocation measure in size-biased order, specifically the density $g_{\alpha,\gamma}$ of the first marginal of $\nu^{\rm sb}_{\alpha,\gamma}$, yields immediately the tagged particle Lévy measure associated with a fragmentation process with alpha-gamma dislocation measure.

Corollary 19. Let $(\Pi^{\alpha,\gamma}(t), t \ge 0)$ be an exchangeable homogeneous $\mathcal{P}_{\mathbb N}$-valued fragmentation process with dislocation measure $\nu_{\alpha,\gamma}$. Then, for the size $|\Pi^{\alpha,\gamma}_{(i)}(t)|$ of the block containing $i \ge 1$, the process $\xi_{(i)}(t) = -\log|\Pi^{\alpha,\gamma}_{(i)}(t)|$, $t \ge 0$, is a pure-jump subordinator with Lévy measure

$$\Lambda_{\alpha,\gamma}(dx) = e^{-x} g_{\alpha,\gamma}(e^{-x})\,dx = \frac{\alpha\,\Gamma(1-\gamma/\alpha)}{\Gamma(1-\alpha)\,\Gamma(1-\gamma)}\,(1-e^{-x})^{-1-\gamma}\,(e^{-x})^{1-\alpha} \times \Big(\gamma + (1-\alpha-\gamma)\Big(2e^{-x}(1-e^{-x}) + \frac{\alpha-\gamma}{1-\gamma}(1-e^{-x})^2\Big)\Big)\,dx.$$
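For numerical work the Lévy density of Corollary 19 is easy to evaluate. The sketch below assumes the density takes the form $C\,(1-e^{-x})^{-1-\gamma}e^{-(1-\alpha)x}\big(\gamma + (1-\alpha-\gamma)(2e^{-x}(1-e^{-x}) + \tfrac{\alpha-\gamma}{1-\gamma}(1-e^{-x})^2)\big)$ with $C = \alpha\Gamma(1-\gamma/\alpha)/(\Gamma(1-\alpha)\Gamma(1-\gamma))$, the constant being our reading of Lemma 18 with $k=1$ and $\theta=-\alpha-\gamma$; it is restricted to $0<\gamma<\alpha<1$. The function name is illustrative.

```python
from math import exp, gamma as G

def levy_density(x, alpha, gam):
    """Density of the tagged-particle Levy measure at x > 0 (assumed form, see above)."""
    u = exp(-x)                                   # size of the tagged fragment after the jump
    const = alpha * G(1 - gam / alpha) / (G(1 - alpha) * G(1 - gam))
    bracket = gam + (1 - alpha - gam) * (
        2 * u * (1 - u) + (alpha - gam) / (1 - gam) * (1 - u) ** 2)
    return const * (1 - u) ** (-1 - gam) * u ** (1 - alpha) * bracket
```

At $\gamma = 1-\alpha$ the bracket collapses to $\gamma$, which gives the stable-tree case; near $x = 0$ the density behaves like a constant times $x^{-1-\gamma}$.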

3.6 Convergence of alpha-gamma trees to self-similar CRTs 

In this subsection, we will prove that the delabelled alpha-gamma trees $T^\circ_n$, represented as $\mathbb{R}$-trees with unit edge lengths and suitably rescaled, converge to CRTs as $n$ tends to infinity.

Lemma 20. If $(\widetilde T^\circ_n)_{n \ge 1}$ are strongly sampling consistent discrete fragmentation trees associated with dislocation measure $\nu_{\alpha,\gamma}$, then

$$\frac{\widetilde T^\circ_n}{n^\gamma} \to \mathcal{T}^{\alpha,\gamma}$$

in the Gromov-Hausdorff sense, in probability as $n \to \infty$.

Proof. Theorem 2 in [16] says that a strongly sampling consistent family of discrete fragmentation trees $(\widetilde T^\circ_n)_{n \ge 1}$ converges in probability to a CRT,

$$\frac{\widetilde T^\circ_n}{n^{\gamma_\nu}\,\ell(n)\,\Gamma(1-\gamma_\nu)} \to \mathcal{T}_{(\gamma_\nu,\nu)}$$

for the Gromov-Hausdorff metric, if the dislocation measure $\nu$ satisfies the following two conditions:

$$\nu(s_1 \le 1-\varepsilon) = \varepsilon^{-\gamma_\nu}\,\ell(1/\varepsilon); \tag{18}$$

$$\int_{\mathcal{S}^\downarrow} \sum_{i \ge 2} s_i |\ln s_i|^{\varrho}\ \nu(ds) < \infty, \tag{19}$$

where $\varrho$ is some positive real number, $\gamma_\nu \in (0,1)$, and $x \mapsto \ell(x)$ is slowly varying as $x \to \infty$. By virtue of (19) in [16], we know that (18) is equivalent to

$$\Lambda([x,\infty)) = x^{-\gamma_\nu}\,\ell(1/x), \quad \text{as } x \downarrow 0,$$

where $\Lambda$ is the Lévy measure of the tagged particle subordinator as in Corollary 19. So, the dislocation measure $\nu_{\alpha,\gamma}$ satisfies (18) with $\ell(x) \to \gamma\alpha\Gamma(1-\gamma/\alpha)/\Gamma(1-\alpha)\Gamma(2-\gamma)$ and $\gamma_{\nu_{\alpha,\gamma}} = \gamma$. Notice that

$$\int_{\mathcal{S}^\downarrow} \sum_{i \ge 2} s_i |\ln s_i|^{\varrho}\ \nu_{\alpha,\gamma}(ds) \le \int_0^\infty x^{\varrho}\,\Lambda_{\alpha,\gamma}(dx).$$

As $x \to \infty$, $\Lambda_{\alpha,\gamma}$ decays exponentially, so $\nu_{\alpha,\gamma}$ satisfies condition (19). This completes the proof. □

Proof of Corollary 3. The splitting rules of $T^\circ_n$ are the same as those of $\widetilde T^\circ_n$, which leads to the identity in distribution for the whole trees. The preceding lemma yields convergence in distribution for $T^\circ_n$. □

4 Limiting results for labelled alpha-gamma trees 

In this section we suppose $0 < \alpha < 1$ and $0 < \gamma \le \alpha$. In the boundary case $\gamma = 0$ trees grow logarithmically and do not possess non-degenerate scaling limits; for $\alpha = 1$ the study in Section 3.2 can be refined to give results analogous to the ones below, but with degenerate tree shapes.

4.1 The scaling limits of reduced alpha-gamma trees

For $\tau$ a rooted $\mathbb{R}$-tree and $x_1, \ldots, x_n \in \tau$, we call $R(\tau, x_1, \ldots, x_n) = \bigcup_{i=1}^n [[\rho, x_i]]$ the reduced subtree associated with $\tau, x_1, \ldots, x_n$, where $\rho$ is the root of $\tau$.

As a fragmentation CRT, the limiting CRT $(\mathcal{T}^{\alpha,\gamma}, \mu)$ is naturally equipped with a mass measure $\mu$ and contains subtrees $\widetilde{\mathcal{R}}_k$, $k \ge 1$, spanned by $k$ leaves chosen independently according to $\mu$. Denote the discrete tree without edge lengths by $\widetilde T_n$; it has exchangeable leaf labels. Then $\widetilde{\mathcal{R}}_k$ is the almost sure scaling limit of the reduced trees $R(\widetilde T_n, [k])$, by Proposition 7 in [16].

On the other hand, if we denote by $T_n$ the (non-exchangeably) labelled trees obtained via the alpha-gamma growth rules, the above result will not apply, but, similarly to the result for the alpha model shown in Proposition 18 in [16], we can still establish a.s. convergence of the reduced subtrees in the alpha-gamma model as stated in Theorem 5, and the convergence result can be strengthened as follows.

Proposition 21. In the setting of Theorem 5,

$$\big(n^{-\gamma} R(T_n, [k]),\ n^{-1}\overline{W}_{n,k}\big) \to (\mathcal{R}_k, \overline{W}_k) \quad \text{a.s. as } n \to \infty,$$

in the sense of Gromov-Hausdorff convergence, where $\overline{W}_{n,k}$ is the total number of leaves in subtrees of $T_n \setminus R(T_n, [k])$ that are linked to the present branch points of $R(T_n, [k])$.

Proof of Theorem 5 and Proposition 21. Actually, the labelled discrete tree $R(T_n, [k])$ with edge lengths removed is $T_k$ for all $n$. Thus, it suffices to prove the convergence of its total length and of its edge length proportions.

Let us consider a first urn model, cf. [11], where at level $n$ the urn contains a black ball for each leaf in a subtree that is directly connected to a branch point of $R(T_n, [k])$, and a white ball for each leaf in one of the remaining subtrees connected to the edges of $R(T_n, [k])$. Suppose that the balls are labelled like the leaves they represent. If the urn then contains $W_{n,k} = m$ white balls and $n - k - m$ black balls, the induced partition of $\{k+1, \ldots, n\}$ has probability function

$$p(m, n-k-m) = \frac{\Gamma(n-m-\alpha-w)\,\Gamma(w+m)\,\Gamma(k-\alpha)}{\Gamma(k-\alpha-w)\,\Gamma(w)\,\Gamma(n-\alpha)} = \frac{B(n-m-\alpha-w,\ w+m)}{B(k-\alpha-w,\ w)},$$

where $w = k(1-\alpha) + \ell\gamma$ is the total weight on the $k$ leaf edges and $\ell$ other edges of $T_k$. As $n \to \infty$, the urn is such that $W_{n,k}/n \to W_k$ a.s., where $\overline{W}_k = 1 - W_k \sim {\rm beta}((k-1)\alpha - \ell\gamma,\ k(1-\alpha) + \ell\gamma)$.
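The probability function of the urn is that of a particular colour sequence, so multiplying by the number of sequences with $m$ white balls gives a beta-binomial distribution for the white count. The sketch below checks this normalisation numerically; the function names are illustrative, and the parameter values in the example ($k=3$, $\ell=2$, as for a binary three-leaf tree) are just one admissible choice with $k-\alpha-w>0$.

```python
from math import lgamma, exp, comb

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def urn_sequence_prob(m, n, k, ell, alpha, gam):
    """p(m, n-k-m): probability of one colour sequence with m white balls."""
    w = k * (1 - alpha) + ell * gam              # initial weight on the edges of T_k
    return exp(log_beta(n - m - alpha - w, w + m) - log_beta(k - alpha - w, w))

def white_count_pmf(m, n, k, ell, alpha, gam):
    """Probability that exactly m of the leaves k+1, ..., n are white."""
    return comb(n - k, m) * urn_sequence_prob(m, n, k, ell, alpha, gam)

masses = [white_count_pmf(m, 20, 3, 2, 0.6, 0.25) for m in range(18)]
```

The mean of this distribution is $(n-k)\,w/(k-\alpha)$, consistent with the limiting white proportion having a beta law with first parameter $w$.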
We will partition the white balls further. Extending the notions of spine, spinal subtrees and spinal bushes from Proposition 10 ($k = 1$), we call, for $k \ge 2$, skeleton the tree $S(T_n, [k])$ of $T_n$ spanned by the ROOT and leaves $[k]$, including the degree-2 vertices; for each such degree-2 vertex $v \in S(T_n, [k])$, we consider the skeletal subtrees $S^{\rm sk}_{vj}$ that we join together into a skeletal bush $S^{\rm sk}_v$. Note that the total length $L^{(n)}_k$ of the skeleton $S(T_n, [k])$ will increase by 1 if leaf $n+1$ in $T_{n+1}$ is added to any of the edges of $S(T_n, [k])$; also, $L^{(n)}_k$ is equal to the number of skeletal bushes (denoted by $K_n$) plus the original total length $k + \ell$ of $T_k$. Hence, as $n \to \infty$



The partition of leaves (associated with white balls), where each skeletal bush gives rise to a block, follows the dynamics of a Chinese Restaurant Process with $(\gamma, w)$-seating plan: given that the number of white balls in the first urn is $m$ and that there are $K_m := K_n$ skeletal bushes on the edges of $S(T_n, [k])$ with $n_i$ leaves on the $i$th bush, the next leaf associated with a white ball will be inserted into any particular bush with $n_i$ leaves with probability proportional to $n_i - \gamma$ and will create a new bush with probability proportional to $w + K_m\gamma$. Hence, the EPPF of this partition of the white balls is

$$p(n_1, \ldots, n_{K_m}) = \frac{\gamma^{K_m-1}\,\Gamma(K_m + w/\gamma)\,\Gamma(1+w)}{\Gamma(1+w/\gamma)\,\Gamma(m+w)} \prod_{i=1}^{K_m} \Gamma_\gamma(n_i).$$

Applying Lemma 8 in connection with (20), we get the probability density of $L_k/W_k^\gamma$ as specified. Finally, we set up another urn model that is updated whenever a new skeletal bush is created. This model records the edge lengths of $R(T_n, [k])$. The alpha-gamma growth rules assign weights $1 - \alpha + (n_i - 1)\gamma$ to leaf edges of $R(T_n, [k])$ and weights $n_i\gamma$ to other edges of length $n_i$, and each new skeletal bush makes one of the weights increase by $\gamma$. Hence, the conditional probability that the length of each edge is $(n_1, \ldots, n_{k+\ell})$ at stage $n$ is


k+e 



nr=iri-.(n.)ns+ir7( 



rii 



Then $D^{(n)}_k$ converges a.s. to the Dirichlet limit as specified. Moreover, $L^{(n)}_k D^{(n)}_k \to L_k D_k$ a.s., and it is easily seen that this implies convergence in the Gromov-Hausdorff sense.

The above argument actually gives us the conditional distribution of $L_k/W_k^\gamma$ given $T_k$ and $W_k$, which does not depend on $W_k$. Similarly, the conditional distribution of $D_k$ given $T_k$, $W_k$ and $L_k$ does not depend on $W_k$ and $L_k$. Hence, the conditional independence of $W_k$, $L_k/W_k^\gamma$ and $D_k$ given $T_k$ follows. □

4.2 Further limiting results 

Alpha-gamma trees not only have edge weights but also vertex weights, and the latter are in correspondence with the vertex degrees. We can get a result on the limiting ratio between the degree of each vertex and the total number of leaves.

Proposition 22. Let $(c_1 + 1, \ldots, c_\ell + 1)$ be the degrees of the vertices in $T_k$, listed by depth first search. The ratio between the degrees in $T_n$ of these vertices and $n^\alpha$ will converge to

$$C_k = (C_{k,1}, \ldots, C_{k,\ell}) = \overline{W}_k^{\,\alpha}\, M_k\, D'_k, \quad \text{where } D'_k \sim {\rm Dirichlet}(c_1 - 1 - \gamma/\alpha, \ldots, c_\ell - 1 - \gamma/\alpha)$$

and $M_k$ are conditionally independent of $W_k$ given $T_k$, where $\overline{W}_k = 1 - W_k$, and $M_k$ has density

$$\frac{\Gamma(\overline{w}+1)}{\Gamma(\overline{w}/\alpha+1)}\, s^{\overline{w}/\alpha}\, g_\alpha(s), \quad s \in (0,\infty),$$

where $\overline{w} = (k-1)\alpha - \ell\gamma$ is the total branch point weight in $T_k$ and $g_\alpha(s)$ is the Mittag-Leffler density.

Proof. Recall the first urn model in the preceding proof, which assigns colour black to leaves attached in subtrees of branch points of $T_k$. We will partition the black balls further. The partition of leaves (associated with black balls), where each subtree $S_{vj}$ of a branch point $v \in R(T_n, [k])$ gives rise to a block, follows the dynamics of a Chinese Restaurant Process with $(\alpha, \overline{w})$-seating plan. Hence, the total degree satisfies $C^{\rm tot}_k(n)/\overline{W}_{n,k}^{\,\alpha} \to M_k$ a.s., where $C^{\rm tot}_k(n)$ is the sum of degrees in $T_n$ of the branch points of $T_k$, and $\overline{W}_{n,k} = n - k - W_{n,k}$ is the total number of leaves of $T_n$ that are in subtrees directly connected to the branch points of $T_k$.

Similarly to the discussion of edge length proportions, we now see that the sequence of degree proportions will converge a.s. to the Dirichlet limit as specified. Since $1 - W_k$ is the a.s. limiting proportion of leaves in subtrees connected to the vertices of $T_k$, the claim follows. □

Given an alpha-gamma tree $T_n$, if we decompose along the spine that connects the ROOT to leaf 1, we will find that the leaf numbers of the subtrees connected to the spine form a Chinese restaurant partition of $\{2, \ldots, n\}$ with parameters $(\alpha, 1-\alpha)$. Applying Lemma 7, we get the following result.

Proposition 23. Let $(T_n, n \ge 1)$ be alpha-gamma trees. Denote by $(P_1, P_2, \ldots)$ the limiting frequencies of the leaf numbers of each subtree of the spine connecting the ROOT to leaf 1 in the order of appearance. These can be represented as

$$(P_1, P_2, \ldots) = (W_1,\ \overline{W}_1 W_2,\ \overline{W}_1\overline{W}_2 W_3,\ \ldots)$$

where the $W_i$ are independent, $W_i$ has beta$(1-\alpha, 1+(i-1)\alpha)$ distribution, and $\overline{W}_i = 1 - W_i$.

Observe that this result does not depend on $\gamma$. This observation also follows from Proposition 6, because colouring (iv)$^{\rm col}$ and crushing (cr) do not affect the partition of leaf labels according to subtrees of the spine.
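The stick-breaking representation of Proposition 23 translates directly into a sampler. The following sketch draws the first few limiting frequencies; the function name and the truncation parameter `count` are illustrative choices.

```python
import random

def spinal_frequencies(alpha, count, rng):
    """Sample (P_1, ..., P_count) by stick-breaking.

    W_i ~ beta(1 - alpha, 1 + (i - 1) * alpha) independently, and
    P_i = W_i * (1 - W_1) * ... * (1 - W_{i-1}).
    """
    remaining = 1.0                      # mass of the stick not yet broken off
    frequencies = []
    for i in range(1, count + 1):
        w = rng.betavariate(1 - alpha, 1 + (i - 1) * alpha)
        frequencies.append(remaining * w)
        remaining *= 1 - w
    return frequencies

sample = spinal_frequencies(0.5, 10, random.Random(3))
```

Only `alpha` enters the sampler, in line with the observation that the result does not depend on $\gamma$.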